Re: [R] Easy 'apply' question
Perfect, thanks Josh!

Cheers,
A

2011/3/10 Joshua Wiley

> Dear Aaron,
>
> The problem is not with your function, but with using apply(). Look at the
> "Details" section of ?apply. You will see that if the data is not an
> array or matrix, apply will coerce it to one (or try). Now go over to
> the "Details" section of ?matrix and you will see that matrices can
> only contain a single class of data and that this follows a hierarchy.
> In short, your data frame is coerced to a matrix and the classes
> are all coerced to the highest: character. You can use lapply()
> instead to get your desired results. Here is an example:
>
> ## Construct (named) test dataframe
> tf <- data.frame(x = 1:3, y = 4:6, z = c("A","A","A"))
>
> ## Show why what you tried did not work
> (test <- apply(tf, 2, class))
>
> ## using lapply()
> (test <- lapply(tf, function(x) {
>   if(is.numeric(x)) mean(x) else unique(x)[1]}))
>
> Hope this helps,
>
> Josh
>
> On Thu, Mar 10, 2011 at 5:11 PM, Aaron Polhamus wrote:
> > Dear list,
> >
> > I couldn't find a solution for this problem online, as simple as it seems.
> > Here's the problem:
> >
> > #Construct test dataframe
> > tf <- data.frame(1:3,4:6,c("A","A","A"))
> >
> > #Try the apply function I'm trying to use
> > test <- apply(tf,2,function(x) if(is.numeric(x)) mean(x) else unique(x)[1])
> >
> > #Look at the output: all columns treated as character columns...
> > test
> >
> > #Look at the format of the original data: the first two columns are integers.
> > str(tf)
> >
> > In general terms, I want to differentiate what function I apply over a
> > row/column based on what type of data that row/column contains. Here I want
> > a simple mean if the column is numeric, and the first unique value if the
> > column is a character column. As you can see, 'apply' treats all columns as
> > characters the way I've written this function.
> >
> > Any thoughts?
> > Many thanks in advance,
> > Aaron
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/

--
Aaron Polhamus
NASA Jet Propulsion Lab
Statistical consultant, Revolution Analytics

R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
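The coercion described in the reply can be made visible directly. The following is a minimal sketch, not part of the original thread. It reuses the named test data frame from the reply, adds stringsAsFactors = FALSE as an assumption so that z is a character column in every R version, and uses a hypothetical helper name, summarise_col.

```r
## Named test data frame, as in the reply.
## stringsAsFactors = FALSE is an added assumption (z stays character
## in R versions before 4.0 as well).
tf <- data.frame(x = 1:3, y = 4:6, z = c("A", "A", "A"),
                 stringsAsFactors = FALSE)

## apply() converts a data frame with as.matrix() before doing anything
## else; with a character column present, the whole matrix is character.
m <- as.matrix(tf)
typeof(m)            # "character"

## sapply()/lapply() walk the columns of the data frame itself,
## so each column keeps its own class.
col_classes <- sapply(tf, class)

## Hypothetical helper: mean for numeric columns,
## first unique value otherwise.
summarise_col <- function(col) {
  if (is.numeric(col)) mean(col) else unique(col)[1]
}
res <- lapply(tf, summarise_col)
```

The result is a list because the per-column answers have different types; sapply() would simplify it back to a single character vector.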
[R] Easy 'apply' question
Dear list,

I couldn't find a solution for this problem online, as simple as it seems. Here's the problem:

#Construct test dataframe
tf <- data.frame(1:3,4:6,c("A","A","A"))

#Try the apply function I'm trying to use
test <- apply(tf,2,function(x) if(is.numeric(x)) mean(x) else unique(x)[1])

#Look at the output: all columns treated as character columns...
test

#Look at the format of the original data: the first two columns are integers.
str(tf)

In general terms, I want to differentiate what function I apply over a row/column based on what type of data that row/column contains. Here I want a simple mean if the column is numeric, and the first unique value if the column is a character column. As you can see, 'apply' treats all columns as characters the way I've written this function.

Any thoughts?

Many thanks in advance,
Aaron
[R] Difficult with round() function
Dear list,

I'm writing a function to re-grid a data set from finer to coarser resolutions in R as follows (I use this function with sapply/apply):

gridResize <- function(startVec = stop("What's your input vector"),
                       to = stop("Missing 'to': How long do you want the final vector to be?")){
  from <- length(startVec)
  shortVec <- numeric()
  tics <- from*to
  for(j in 1:to){
    interval <- ((j/to)*tics - (1/to)*tics + 1):((j/to)*tics)
    benchmarks <- interval/to
    #FIRST RUN ASSUMES FINAL BENCHMARK/TO IS AN INTEGER...
    positions <- which(round(benchmarks) == benchmarks)
    indeces <- benchmarks[positions]
    fracs <- numeric()
    #SINCE MUCH OF THE TIME THIS WILL NOT BE THE CASE, THIS SCRIPT DEALS WITH THE REMAINDER...
    for(i in 1:length(positions)){
      if(i == 1) fracs[i] <- positions[i]/length(benchmarks)
      else{
        fracs[i] <- (positions[i] - sum(positions[1:(i-1)]))/length(benchmarks)
      }
    }
    #AND UPDATES STARTVEC INDECES AND FRACTION MULTIPLIERS
    if(max(positions) != length(benchmarks)) indeces <- c(indeces, max(indeces) + 1)
    if(sum(fracs) != 1) fracs <- c(fracs, 1 - sum(fracs))
    fromVals <- startVec[indeces]
    if(any(is.na(fromVals))){
      NAindex <- which(is.na(fromVals))
      if(sum(fracs[-NAindex]) >= 0.5) shortVec[j] <- sum(fromVals*fracs, na.rm=TRUE)
      else shortVec[j] <- NA
    }else{
      shortVec[j] <- sum(fromVals*fracs)
    }
  }
  return(shortVec)
}

For the simple test case

test <- gridResize(startVec = c(2,4,6,8,10,8,6,4,2), to = 7)

the function works fine. For larger vectors, however, it breaks down. E.g.:

test <- gridResize(startVec = rnorm(300, 9, 20), to = 200)

This returns the error:

Error in positions[1:(i - 1)] :
  only 0's may be mixed with negative subscripts

and the problem seems to be in the line positions <- which(round(benchmarks) == benchmarks). In this particular example the code breaks at j = 27. When I set j = 27 and run the calculation manually I discover the following:

> benchmarks[200]
[1] 40
> benchmarks[200] == 40
[1] FALSE
> round(benchmarks[200]) == 40
[1] TRUE

Even though my benchmark calculation seems to be returning clean integers to serve as inputs for the creation of the 'positions' variable, for whatever reason R doesn't read it that way. I would be very grateful for any advice on either how I can alter my approach entirely (I am sure there is a far more elegant way to regrid data in R) or a simple fix for this rounding error.

Many thanks in advance,
Aaron

--
Aaron Polhamus
Statistical consultant, Revolution Analytics
MSc Applied Statistics, The University of Oxford, 2009
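The behaviour reported above is typical of floating-point arithmetic: a value can print as 40 while differing from 40 in its last bits, so an exact == test returns FALSE. One common workaround, sketched here as an assumption rather than as the thread's resolution, is to compare against a small tolerance. The helper name is_whole and its default tolerance are illustrative choices, not part of the original post.

```r
## Hypothetical helper: TRUE where x is within tol of a whole number.
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) < tol
}

## Classic illustration: sqrt(2)^2 prints as 2 but is not exactly 2.
x <- sqrt(2)^2
x == 2          # FALSE
is_whole(x)     # TRUE

## Tolerant replacement for which(round(benchmarks) == benchmarks),
## shown on a small made-up vector.
benchmarks <- c(39.5, x * 20, 40.5)
positions <- which(is_whole(benchmarks))
```

The error message itself comes from a second effect: when positions is empty, 1:length(positions) evaluates to c(1, 0), and the i = 0 pass indexes positions[1:-1], which mixes positive and negative subscripts. Looping over seq_along(positions) avoids that empty-vector case.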
Re: [R] Arrange elements on a matrix according to rowSums + short 'apply' Q
Ivan and Michael,

Many thanks for the tips, those solved my queries. I am still interested in how to force custom functions to work over rows rather than columns when using apply, but the MAT/rowSums(MAT) technique is definitely the most efficient way to go for this application.

Cheers,
Aaron

2010/12/2 Michael Bedward

> Hi Aaron,
>
> Following up on Ivan's suggestion, if you want the column order to
> mirror the row order...
>
> mo <- order(rowSums(MAT), decreasing=TRUE)
> MAT2 <- MAT[mo, mo]
>
> Also, you don't need all those extra c() calls when creating
> inputData, just the outermost one.
>
> Regarding your second question, your statements...
>
> TMAT <- apply(t(MAT), 2, function(X) X/sum(X))
> TMAT <- t(TMAT)
>
> are actually just a complicated way of doing this...
>
> TMAT <- MAT / rowSums(MAT)
>
> You can confirm that by doing it your way and then this...
>
> TMAT == MAT / rowSums(MAT)
>
> ...and you should see a matrix of TRUE values
>
> Michael
>
> On 2 December 2010 20:43, Ivan Calandra wrote:
> > Hi,
> >
> > Here is a not so easy way to do your first step, but it works:
> > MAT2 <- cbind(MAT, rowSums(MAT))
> > MAT[order(MAT2[,6], decreasing=TRUE),]
> >
> > For the second, I don't know!
> >
> > HTH,
> > Ivan
> >
> > On 12/2/2010 09:46, Aaron Polhamus wrote:
> >>
> >> Greetings,
> >>
> >> My goal is to create a Markov transition matrix (probability of moving
> >> from one state to another) with the 'highest traffic' portion of the
> >> matrix occupying the top-left section. Consider the following sample:
> >>
> >> inputData <- c(
> >>   c(5, 3, 1, 6, 7),
> >>   c(9, 7, 3, 10, 11),
> >>   c(1, 2, 3, 4, 5),
> >>   c(2, 4, 6, 8, 10),
> >>   c(9, 5, 2, 1, 1)
> >> )
> >>
> >> MAT <- matrix(inputData, nrow = 5, ncol = 5, byrow = TRUE)
> >> colnames(MAT) <- c("A", "B", "C", "D", "E")
> >> rownames(MAT) <- c("A", "B", "C", "D", "E")
> >>
> >> rowSums(MAT)
> >>
> >> I want to re-arrange the elements of this matrix such that the elements
> >> with the largest row sums are placed to the top-left, in descending
> >> order. Does this make sense? In this case the order I'm looking for
> >> would be B, D, A, E, C. Any thoughts?
> >>
> >> As an aside, here is the function I've written to construct the
> >> transition matrix. Is there a more elegant way to do this that doesn't
> >> involve a double transpose?
> >>
> >> TMAT <- apply(t(MAT), 2, function(X) X/sum(X))
> >> TMAT <- t(TMAT)
> >>
> >> I tried the following:
> >>
> >> TMAT <- apply(MAT, 1, function(X) X/sum(X))
> >>
> >> But my custom function is still getting applied over the columns of
> >> the array, rather than the rows. For a check try:
> >>
> >> rowSums(TMAT)
> >> colSums(TMAT)
> >>
> >> Row sums here should equal 1...
> >>
> >> Many thanks in advance,
> >> Aaron
> >
> > --
> > Ivan CALANDRA
> > PhD Student
> > University of Hamburg
> > Biozentrum Grindel und Zoologisches Museum
> > Abt. Säugetiere
> > Martin-Luther-King-Platz 3
> > D-20146 Hamburg, GERMANY
> > +49(0)40 42838 6231
> > ivan.calan...@uni-hamburg.de
> >
> > http://www.for771.uni-bonn.de
> > http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
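On the remaining question of applying a custom function over rows: apply(MAT, 1, f) does call f on each row, but when f returns a vector, apply() stores each result as a column of the output, so the result comes back transposed. A minimal sketch using the sample matrix from the thread follows. prop.table() and sweep() are base R alternatives added here for comparison and were not mentioned in the original messages; the TMAT_* names are illustrative.

```r
## Sample matrix from the thread, built with a single c() call
MAT <- matrix(c(5, 3, 1, 6, 7,
                9, 7, 3, 10, 11,
                1, 2, 3, 4, 5,
                2, 4, 6, 8, 10,
                9, 5, 2, 1, 1),
              nrow = 5, ncol = 5, byrow = TRUE,
              dimnames = list(LETTERS[1:5], LETTERS[1:5]))

## apply() over rows: each row's result becomes a COLUMN of the output,
## so a single transpose restores the row layout.
TMAT_apply <- t(apply(MAT, 1, function(x) x / sum(x)))

## Equivalent forms without apply()
TMAT_div   <- MAT / rowSums(MAT)                # vector recycled down columns
TMAT_prop  <- prop.table(MAT, margin = 1)       # row proportions
TMAT_sweep <- sweep(MAT, 1, rowSums(MAT), "/")  # explicit row-wise division

rowSums(TMAT_apply)   # every row sums to 1

## Reorder rows and columns by descending row sum
mo <- order(rowSums(MAT), decreasing = TRUE)
rownames(MAT)[mo]     # "B" "D" "A" "E" "C"
```

For a simple normalisation the vectorised forms are shorter; the t(apply(...)) pattern is mainly useful when the per-row function cannot be written as whole-matrix arithmetic.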
[R] Arrange elements on a matrix according to rowSums + short 'apply' Q
Greetings,

My goal is to create a Markov transition matrix (probability of moving from one state to another) with the 'highest traffic' portion of the matrix occupying the top-left section. Consider the following sample:

inputData <- c(
  c(5, 3, 1, 6, 7),
  c(9, 7, 3, 10, 11),
  c(1, 2, 3, 4, 5),
  c(2, 4, 6, 8, 10),
  c(9, 5, 2, 1, 1)
)

MAT <- matrix(inputData, nrow = 5, ncol = 5, byrow = TRUE)
colnames(MAT) <- c("A", "B", "C", "D", "E")
rownames(MAT) <- c("A", "B", "C", "D", "E")

rowSums(MAT)

I want to re-arrange the elements of this matrix such that the elements with the largest row sums are placed to the top-left, in descending order. Does this make sense? In this case the order I'm looking for would be B, D, A, E, C. Any thoughts?

As an aside, here is the function I've written to construct the transition matrix. Is there a more elegant way to do this that doesn't involve a double transpose?

TMAT <- apply(t(MAT), 2, function(X) X/sum(X))
TMAT <- t(TMAT)

I tried the following:

TMAT <- apply(MAT, 1, function(X) X/sum(X))

But my custom function is still getting applied over the columns of the array, rather than the rows. For a check try:

rowSums(TMAT)
colSums(TMAT)

Row sums here should equal 1...

Many thanks in advance,
Aaron