Re: [R] rowSums problem
This is precisely what I needed; I can't believe how simple it is. Thanks! -- View this message in context: http://r.789695.n4.nabble.com/rowSums-problem-tp4632405p4632461.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rowSums problem
I am having a problem visualizing what you are doing here.

What I see is a temp file of 760 elements. Reading in your data, I would expect it to have dim(760, 1), but your code seems to suggest that it is not a single vector. I must be missing something, as I would have thought that the output would be a vector. Could exporting the csv files have somehow mixed up the data structure that you are working with?

You might want to provide the two data files using the dput() command.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: vashchyshy...@gmail.com
> Sent: Tue, 5 Jun 2012 07:48:51 -0700 (PDT)
> To: r-help@r-project.org
> Subject: [R] rowSums problem
>
> Now, all I need to do is multiply all of the rows of the temperature data
> frame by the weights (element-wise), and sum across the columns.
>
> However, when I try to use the most obvious approach,
>
> temp3880W <- weight3880*temp[,(3:50)]
> temp3880W <- rowSums(temp3880W)
>
> I get the wrong result:
>
> > head(temp3880W)
>          1          2          3          4          5          6
> -0.4904454 -1.2728543 -1.5360133 -0.2687030 62.3048012  6.2610305
>
> Can anyone point out to me what is going wrong here?
Re: [R] rowSums problem
Hello,

The files you've uploaded are the weights file and the results file, not the original temp.csv. So this is untested, but it seems you have a standard matrix multiplication problem. Note that %*% needs numeric matrices or vectors rather than data frames, hence the conversions:

    temp3880W <- as.matrix(temp[, 3:50]) %*% unlist(weight3880)

Hope this helps,

Rui Barradas

On 05-06-2012 15:48, alonis10 wrote:
> Now, all I need to do is multiply all of the rows of the temperature data
> frame by the weights (element-wise), and sum across the columns.
>
> I've only been successful by using a for loop which is far too slow:
>
> temp3880 <- rep(0,length(temp$Year))
> for (i in 1:length(temp$Year)) {
>   wmul <- weight3880*as.vector(temp[i,(3:50)])
>   temp3880[i] <- sum(wmul)
> }
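To illustrate the matrix-product suggestion, here is a minimal sketch on made-up data. The toy `temp` data frame and the weight vector `w` are assumptions standing in for the poster's files, which are not reproduced here. It checks that the row-by-row loop and the matrix product give the same weighted sums.

```r
# Toy stand-ins for the poster's data (hypothetical values, not the real files)
set.seed(1)
temp <- data.frame(Year = 2001:2004, Month = 1L,
                   matrix(rnorm(4 * 3), nrow = 4))
w <- c(0.5, 0.3, 0.2)   # weights for the 3 grid-point columns, summing to 1

# Loop version, as in the original post: one weighted sum per row
loop_res <- numeric(nrow(temp))
for (i in seq_len(nrow(temp))) {
  loop_res[i] <- sum(w * unlist(temp[i, 3:5]))
}

# Matrix-product version: convert to a numeric matrix first,
# because %*% does not accept a data frame
mat_res <- drop(as.matrix(temp[, 3:5]) %*% w)

# The two approaches agree up to floating point tolerance
all.equal(loop_res, unname(mat_res))
```

The matrix product does all the multiplications and additions in compiled code, which is likely why it avoids the slowness of the explicit loop.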
Re: [R] rowSums problem
Here are the files I promised to upload:

temp3880.csv: http://r.789695.n4.nabble.com/file/n4632406/temp3880.csv
weight3880.csv: http://r.789695.n4.nabble.com/file/n4632406/weight3880.csv
[R] rowSums problem
I'm having a very frustrating problem, trying to find the inverse-distance-squared weighted interpolants of some weather data.

I have a data frame of weights, which sum to 1. I have attached the weights data. I also have a data frame of temperatures at 48 grid points, which I have also attached.

Now, all I need to do is multiply all of the rows of the temperature data frame by the weights (element-wise), and sum across the columns.

However, when I try to use the most obvious approach,

    temp3880W <- weight3880*temp[,(3:50)]
    temp3880W <- rowSums(temp3880W)

I get the wrong result:

    > head(temp3880W)
             1          2          3          4          5          6
    -0.4904454 -1.2728543 -1.5360133 -0.2687030 62.3048012  6.2610305

I've only been successful by using a for loop, which is far too slow:

    temp3880 <- rep(0,length(temp$Year))
    for (i in 1:length(temp$Year)) {
      wmul <- weight3880*as.vector(temp[i,(3:50)])
      temp3880[i] <- sum(wmul)
    }

This gives the result

    > head(temp3880)
    [1] -6.936374 -9.617799 -7.227260  1.135293  8.973817 13.632454

Can anyone point out to me what is going wrong here? I've tried the first approach with smaller data frames and vectors and it seems to work fine, so I must be making a mistake somewhere...

Thank you!
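One plausible explanation for the wrong result is R's recycling rule: a vector multiplied by a matrix is recycled down the columns, so the weights are not lined up with the columns they belong to. The sketch below shows this on a small made-up matrix (`x` and `w` are illustrative, not the poster's data), and assumes a data frame is recycled in the same column-major order.

```r
x <- matrix(1:6, nrow = 2)   # columns are (1,2), (3,4), (5,6)
w <- c(1, 0, 0)              # all weight on the first column

# Plain multiplication recycles w down the columns (1,0,0,1,0,0 against 1..6),
# so the weights do not line up with the columns
wrong <- rowSums(w * x)

# sweep() multiplies column j by w[j], which is the intended weighting
right <- rowSums(sweep(x, 2, w, `*`))

wrong   # 1 4
right   # 1 2
```

Only `right` reproduces the first column, which is what weights of (1, 0, 0) should return.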
Re: [R] rowSums - am I getting something wrong?
Hi Thomas,

Several of us explained this in different ways just last week, so you might search the archive.

Floating point numbers are an approximate representation of real numbers. Many values that can be written exactly as decimal fractions, such as 0.1, 0.3 and 0.6, cannot be represented exactly in binary. So the floating point sum 0.6+0.3+0.1 is NOT necessarily exactly 1.0. You can use signif and round to overcome this:

    > a = seq(0,1,0.1)
    > a
     [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
    > a[7]-0.6
    [1] 1.110223e-16
    > 1-(a[4]+a[7]+a[2])
    [1] -2.220446e-16
    > b = rev(seq(1,0,-0.1))
    > b
     [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
    > a-b
     [1] 0.000000e+00 2.775558e-17 5.551115e-17 1.110223e-16 1.110223e-16
     [6] 0.000000e+00 1.110223e-16 1.110223e-16 0.000000e+00 0.000000e+00
    [11] 0.000000e+00
    > round(a-b,10)
     [1] 0 0 0 0 0 0 0 0 0 0 0
    > round(a,10)-round(b,10)
     [1] 0 0 0 0 0 0 0 0 0 0 0

The first commandment of floating point programming is

THOU SHALT NOT TEST WHETHER TWO FP NUMBERS ARE EQUAL

HTH
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of thomas.salve...@syngenta.com
Sent: Monday, March 07, 2011 2:09 AM
To: r-help@r-project.org
Subject: [R] rowSums - am I getting something wrong?

I then want to get the sum of the rows and delete any row with a sum of greater than 1. But have a problem with rows containing any combination of the values 0.6, 0.3 and 0.1 as the sum of these is clearly 1, but a request for which rows have a sum greater than 1 will return rows with these values.

What exactly is going on here? I don't have the problem with other combinations (eg 0.7, 0.2, 0.1).
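As a small illustration of the advice not to test floating point numbers for exact equality, the sketch below compares with a tolerance instead. The helper `gt_tol` and its default tolerance of 1e-8 are assumptions made up for this example, not something proposed in the thread.

```r
a <- seq(0, 1, 0.1)

# Exact comparison can fail even though the printed values look identical
exact <- a[4] == 0.3

# all.equal() compares up to a small numerical tolerance
approx <- isTRUE(all.equal(a[4], 0.3))

# Hypothetical helper: "greater than" that ignores differences below tol
gt_tol <- function(x, y, tol = 1e-8) x > y + tol

# The sum exceeds 1 only by rounding error, so it is not flagged
flagged <- gt_tol(a[7] + a[4] + a[2], 1)
```

Here `exact` is FALSE while `approx` is TRUE, and `flagged` is FALSE because the excess over 1 is far below the tolerance.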
Re: [R] rowSums - am I getting something wrong?
Hi Tom,

That's once again the floating point number issue: see R FAQ 7.31.

Look at this:

    > sum(m[161,])
    [1] 1
    > sum(m[161,])==1
    [1] FALSE
    > sum(m[161,])-1
    [1] 2.220446e-16

So, as stored in floating point, 0.6+0.3+0.1 is indeed (very slightly) greater than 1.

Try this instead, rounding to a fixed number of decimal places before comparing:

    > round(sum(m[161,]), 10)==1
    [1] TRUE

HTH,
Ivan

On 3/7/2011 08:08, thomas.salve...@syngenta.com wrote:
> I then want to get the sum of the rows and delete any row with a sum of
> greater than 1. But have a problem with rows containing any combination
> of the values 0.6, 0.3 and 0.1 as the sum of these is clearly 1, but a
> request for which rows have a sum greater than 1 will return rows with
> these values.
>
> Row 161 is the first row containing these values:
> [161,] 0.6 0.3 0.1

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
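Applied to the whole matrix rather than a single row, the same idea might look like the sketch below. The tolerance `tol` is an assumed value, chosen to be far smaller than the 0.1 grid step and far larger than the rounding error seen above.

```r
# Rebuild the matrix from the original post
a <- seq(0, 1, 0.1)
m <- cbind(rep(a, 121), rep(a, 11, each = 11), rep(a, 1, each = 121))

tol  <- 1e-8                    # assumed tolerance
keep <- rowSums(m) <= 1 + tol   # TRUE where the row sum is at most 1, up to rounding
m_ok <- m[keep, , drop = FALSE]

keep[161]    # row 161 (0.6 0.3 0.1) is now kept
nrow(m_ok)
```

With this tolerance, row 161 is retained and 286 rows remain, which matches the number of ways three values from 0, 0.1, ..., 1 can sum to at most 1.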
[R] rowSums - am I getting something wrong?
I am trying to construct a data set with some sequences, for example:

    a = seq(0,1,0.1)
    m = matrix(nrow = 1331, ncol = 3)
    m[,1] = rep(a,121)
    m[,2] = rep(a,11,each = 11)
    m[,3] = rep(a,1,each = 121)

I realize that there may be better ways of doing this, but this approach demonstrates the problem I'm having.

I then want to get the sum of the rows and delete any row with a sum greater than 1. But I have a problem with rows containing any combination of the values 0.6, 0.3 and 0.1: the sum of these is clearly 1, but a request for which rows have a sum greater than 1 will return rows with these values.

Row 161 is the first row containing these values:

    [161,] 0.6 0.3 0.1

    > which(rowSums(m)>1)
    [53] 119 120 121 132 142 143 152 153 154 161 162

As far as I can tell this only affects combinations of 0.6, 0.3 and 0.1 (though I haven't checked every value in the matrix).

If I try the following:

    > q=rowSums(m)
    > which(q>1)
    [53] 119 120 121 132 142 143 152 153 154 161 162

But if I add and subtract 1 from this:

    > q=q+1
    > q=q-1
    > which(q>1)
    [53] 119 120 121 132 142 143 152 153 154 162

What exactly is going on here? I don't have the problem with other combinations (e.g. 0.7, 0.2, 0.1). I assume that there is something about the data format that I don't understand, but if I make a data frame of the matrix I find the same effect.

Any help would be great

Tom
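Since the post mentions that there may be better ways to build the data set, here is one possible sketch that sidesteps the rounding issue by working in integer tenths and converting to proportions only at the end. The column names `x1`, `x2`, `x3` are made up for the example.

```r
# Build the grid in integer tenths (0..10) so that sums are exact
g <- expand.grid(x1 = 0:10, x2 = 0:10, x3 = 0:10)

# Integer arithmetic: no rounding error in this comparison
g <- g[g$x1 + g$x2 + g$x3 <= 10L, ]

# Convert back to proportions only at the end
m <- as.matrix(g) / 10
nrow(m)
```

expand.grid() varies the first column fastest, so the row order follows the same pattern as the rep() calls in the post, just with the unwanted rows already removed.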
Re: [R] RowSums Question
Thanks Jorge,

It works great. I solved it by using a loop, but I like your way better.

Thanks again
Cameron
Re: [R] RowSums Question
Hi Cameron,

Maybe this (untested)?

    rowSums(!is.na(tsObj))

HTH,
Jorge

On Thu, Nov 18, 2010 at 2:38 PM, cameron <> wrote:
> Let's say I have a timeSeries table
>
>          A   B   C
> 1/1/90  NA   1   2
> 1/2/90  NA   1   1
> 1/3/90  NA   1  -1
> 1/4/90  NA  -1   1
> 1/5/90   1   1   1
> 1/6/90   1  51   1
>
> I want to count is.numeric. since NA is not numeric
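A quick check of the counting idea on the table from the question, using a plain matrix as a stand-in (an assumption: the real object is a timeSeries, which is not reproduced here):

```r
# Hypothetical stand-in for the timeSeries object, as a plain matrix
tsObj <- matrix(c(NA, NA, NA, NA, 1, 1,
                   1,  1,  1, -1, 1, 51,
                   2,  1, -1,  1, 1, 1),
                ncol = 3,
                dimnames = list(c("1/1/90", "1/2/90", "1/3/90",
                                  "1/4/90", "1/5/90", "1/6/90"),
                                c("A", "B", "C")))

# !is.na() is TRUE for each non-missing cell; rowSums() counts the TRUEs
n_obs <- rowSums(!is.na(tsObj))
n_obs
```

The counts come out as 2 for the first four dates and 3 for the last two, regardless of the actual values in the cells.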
Re: [R] RowSums Question
Thanks Henrique,

I have another question.

Let's say I have a timeSeries table

             A   B   C
    1/1/90  NA   1   2
    1/2/90  NA   1   1
    1/3/90  NA   1  -1
    1/4/90  NA  -1   1
    1/5/90   1   1   1
    1/6/90   1  51   1

I want to count the numeric (non-NA) values in each row, since NA is not numeric:

            A
    1/1/90  2
    1/2/90  2
    1/3/90  2
    1/4/90  2
    1/5/90  3
    1/6/90  3

Thanks again :)
Cameron
Re: [R] RowSums Question
Try this:

    rowSums(tsObj, na.rm = TRUE)

On Thu, Nov 18, 2010 at 3:58 PM, cameron wrote:
> if i use RowSums, i will get
>
> 1/5/90  3
> 1/6/90  3
>
> but i want
>
> 1/1/90  2
> 1/2/90  2
> 1/3/90  2
> 1/4/90  2
> 1/5/90  3
> 1/6/90  3
>
> I cant figure out a way without doing loop.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
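The effect of na.rm can be seen on a small stand-in for the table in the question. This is a sketch on a plain numeric matrix; how a timeSeries object itself handles missing rows is not shown and may differ.

```r
# Hypothetical stand-in for the timeSeries table, as a plain matrix
tsObj <- cbind(A = c(NA, NA, NA, NA, 1, 1), B = rep(1, 6), C = rep(1, 6))

with_na    <- rowSums(tsObj)                 # NA wherever a row contains an NA
without_na <- rowSums(tsObj, na.rm = TRUE)   # missing cells are skipped

with_na
without_na
```

With na.rm = TRUE the first four rows sum to 2 and the last two to 3, which is the result asked for.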
[R] RowSums Question
I have a question on rowSums.

Let's say I have a timeSeries table

             A  B  C
    1/1/90  NA  1  1
    1/2/90  NA  1  1
    1/3/90  NA  1  1
    1/4/90  NA  1  1
    1/5/90   1  1  1
    1/6/90   1  1  1

If I use rowSums, I will get

    1/5/90  3
    1/6/90  3

but I want

    1/1/90  2
    1/2/90  2
    1/3/90  2
    1/4/90  2
    1/5/90  3
    1/6/90  3

I can't figure out a way without using a loop.

Thanks
Cameron
Re: [R] rowSums()
On 9/24/2008 10:38 AM, Marc Schwartz wrote:
> The behavior you observe is documented in ?rowSums in the Value section:
>
>   If there are no values in a range to be summed over (after removing
>   missing values with na.rm = TRUE), that component of the output is set
>   to 0 (*Sums) or NA (*Means), consistent with sum and mean.

Based on the difference in behavior for Sums and Means, this might be another possibility:

    > rowMeans(testDat, na.rm=TRUE) * rowSums(!is.na(testDat))
    [1]  1 NA  6

> So:
>
>> sum(c(NA, NA), na.rm = TRUE)
> [1] 0
>
> As per the definition of the sum of an empty set being 0, which I got
> burned on myself a while back.
>
> You could feasibly use:
>
> Res <- rowSums(testDat, na.rm = TRUE)
> is.na(Res) <- rowSums(is.na(testDat)) == ncol(testDat)
>
> HTH,
>
> Marc Schwartz

--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
Re: [R] rowSums()
on 09/24/2008 09:06 AM Doran, Harold wrote:
> rowSums() with na.rm=TRUE generates the following, which is not desired:
>
>> rowSums(testDat[, c('A', 'B')], na.rm=T)
> [1] 1 0 6
>
> I see why this occurs, but what I hope to have returned would be:
> [1] 1 NA 6
>
> Is there a "proper" way to do this? In my real data, nrow is over
> 100,000

The behavior you observe is documented in ?rowSums in the Value section:

  If there are no values in a range to be summed over (after removing
  missing values with na.rm = TRUE), that component of the output is set
  to 0 (*Sums) or NA (*Means), consistent with sum and mean.

So:

    > sum(c(NA, NA), na.rm = TRUE)
    [1] 0

As per the definition of the sum of an empty set being 0, which I got burned on myself a while back.

You could feasibly use:

    Res <- rowSums(testDat, na.rm = TRUE)
    is.na(Res) <- rowSums(is.na(testDat)) == ncol(testDat)

HTH,

Marc Schwartz
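The two-step approach above could be wrapped in a small function for reuse. The sketch below does that; the function name `row_sums_keep_na` is made up for this example and does not come from the thread or from any package.

```r
# Hypothetical helper: row sums that ignore NAs,
# but return NA for rows that are entirely missing
row_sums_keep_na <- function(x) {
  res <- rowSums(x, na.rm = TRUE)              # all-NA rows come back as 0 here
  is.na(res) <- rowSums(is.na(x)) == ncol(x)   # set those rows to NA instead
  res
}

testDat <- data.frame(A = c(1, NA, 3), B = c(NA, NA, 3))
out <- row_sums_keep_na(testDat)
out
```

On the example data this returns 1, NA and 6, which is the result the original question asked for.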
Re: [R] rowSums()
Try the following:

    testDat <- data.frame(A = c(1,NA,3), B = c(NA, NA, 3))
    ind <- rowSums(is.na(testDat)) == length(testDat)
    out <- rowSums(testDat, na.rm = TRUE)
    out[ind] <- NA
    out

I hope it helps.

Best,
Dimitris

Doran, Harold wrote:
> I see why this occurs, but what I hope to have returned would be:
> [1] 1 NA 6
>
> Is there a "proper" way to do this? In my real data, nrow is over
> 100,000

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Re: [R] rowSums()
On 9/24/2008 10:06 AM, Doran, Harold wrote:
> To get what I want I could do the following, but normally my ideas are
> bad ideas and there are codified and proper ways to do things.
>
> rr <- numeric(nrow(testDat))
> for(i in 1:nrow(testDat)) rr[i] <- if(all(is.na(testDat[i,]))) NA else
> sum(testDat[i,], na.rm=T)
>
>> rr
> [1] 1 NA 6
>
> Is there a "proper" way to do this? In my real data, nrow is over
> 100,000

I don't know if it is "proper", but here is a slightly different way that I find easier to read:

    > apply(testDat, 1, function(x){
        ifelse(all(is.na(x)), NA, sum(x, na.rm=TRUE))
      })
    [1]  1 NA  6

hope this helps,

Chuck

--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
Re: [R] rowSums()
I guess the fastest way would be:

    rs <- rowSums( testDat, na.rm=TRUE )
    rs[ which( rowMeans(is.na(testDat)) == 1 ) ] <- NA

since both rowSums and rowMeans are internally coded in C.

Regards,
Adai

Doran, Harold wrote:
> I see why this occurs, but what I hope to have returned would be:
> [1] 1 NA 6
>
> Is there a "proper" way to do this? In my real data, nrow is over
> 100,000
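To get a rough feel for the speed claim, one could compare a row-by-row apply() version with the vectorised version on simulated data of about the size mentioned in the question. This is only a sketch: the data are random, the function names `slow` and `fast` are made up, and timings will vary by machine, so none are quoted here.

```r
set.seed(42)
n <- 1e5
x <- matrix(sample(c(1:5, NA), n * 2, replace = TRUE), ncol = 2)

# apply() version: one R-level function call per row
slow <- function(x) {
  apply(x, 1, function(r) if (all(is.na(r))) NA else sum(r, na.rm = TRUE))
}

# Vectorised version: two passes over the data in compiled code
fast <- function(x) {
  rs <- rowSums(x, na.rm = TRUE)
  rs[rowSums(is.na(x)) == ncol(x)] <- NA
  rs
}

system.time(r1 <- slow(x))
system.time(r2 <- fast(x))

# Both versions should give the same answer
all.equal(as.numeric(r1), r2)
```

The final check matters more than the timings: it confirms that the faster version is a drop-in replacement on this data, including the rows that are entirely missing.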
[R] rowSums()
Say I have the following data:

    testDat <- data.frame(A = c(1,NA,3), B = c(NA, NA, 3))

    > testDat
       A  B
    1  1 NA
    2 NA NA
    3  3  3

rowSums() with na.rm=TRUE generates the following, which is not desired:

    > rowSums(testDat[, c('A', 'B')], na.rm=T)
    [1] 1 0 6

rowSums() with na.rm=F generates the following, which is also not desired:

    > rowSums(testDat[, c('A', 'B')], na.rm=F)
    [1] NA NA  6

I see why this occurs, but what I hope to have returned would be:

    [1]  1 NA  6

To get what I want I could do the following, but normally my ideas are bad ideas and there are codified and proper ways to do things.

    rr <- numeric(nrow(testDat))
    for(i in 1:nrow(testDat)) rr[i] <- if(all(is.na(testDat[i,]))) NA else sum(testDat[i,], na.rm=T)

    > rr
    [1]  1 NA  6

Is there a "proper" way to do this? In my real data, nrow is over 100,000.

Thanks,
Harold

    > sessionInfo()
    R version 2.7.2 (2008-08-25)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] MiscPsycho_1.2 lattice_0.17-13 statmod_1.3.6

    loaded via a namespace (and not attached):
    [1] grid_2.7.2
Re: [R] rowSums() and is.integer()
On Wed, 21 Nov 2007, Robin Hankin wrote:

> On 21 Nov 2007, at 08:30, Prof Brian Ripley wrote:
>
>> On Tue, 20 Nov 2007, Tim Hesterberg wrote:
>>
>>> I wrote the original rowSums (in S-PLUS).
>>> There, rowSums() does not coerce integer to double.
>>
>> Actually, neither does R. It computes a double answer but does no coercion
>> per se.
>>
>>> However, one advantage of coercion is to avoid integer overflow.
>>
>> Indeed, as I told Robin Hankin privately, that was the design reason.
>
> Brian Ripley also reminded me that the sum() of integers is an integer,
> behaviour that I find desirable.
>
> The reason for my starting this thread is that sometimes I actually *want*
> sums of integers to overflow: my interest is in exact computations where I
> must be absolutely certain that there can be no rounding error.
>
> If the sum cannot be represented in integers, I want this fact to be flagged
> with extreme vigour as it signals what might be catastrophic loss of
> precision.

Doubles hold integers to a much higher precision (up to 2^53-1), so you can just check if any of the results exceed .Machine$integer.max. It's very unlikely that you are summing enough integers for there to be a possibility of floating-point imprecision in an intermediate sum.

> At least, that's my current thinking.
>
> best wishes
>
> rksh
>
>>> Tim Hesterberg
>>>
>>>> ... So, why does rowSums() coerce to double (behaviour that is
>>>> undesirable for me)?

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
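The check suggested in this reply might look roughly like the sketch below. The function name exact_row_sums is hypothetical, and the sketch assumes an integer matrix without NAs and few enough terms per row that the double sums stay exactly representable (below 2^53).

```r
# Hypothetical helper: integer row sums that fail loudly when a result
# cannot be represented as an R integer.
exact_row_sums <- function(x) {
  stopifnot(is.integer(x))                   # assumption: integer input only
  rs <- rowSums(x)                           # computed and returned as double
  if (any(abs(rs) > .Machine$integer.max)) {
    stop("row sum exceeds .Machine$integer.max")
  }
  as.integer(rs)                             # safe: every value fits in an integer
}

a <- matrix(1:30, 5, 6)
small <- exact_row_sums(a)

# Two copies of the largest integer in one row: the sum does not fit.
big <- matrix(.Machine$integer.max, 1, 2)
overflowed <- inherits(try(exact_row_sums(big), silent = TRUE), "try-error")
```

Here small should agree with apply(a, 1, sum), while the second call stops with an error instead of silently returning a double.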
Re: [R] rowSums() and is.integer()
On 21 Nov 2007, at 08:30, Prof Brian Ripley wrote:

> On Tue, 20 Nov 2007, Tim Hesterberg wrote:
>
>> I wrote the original rowSums (in S-PLUS).
>> There, rowSums() does not coerce integer to double.
>
> Actually, neither does R. It computes a double answer but does no
> coercion per se.
>
>> However, one advantage of coercion is to avoid integer overflow.
>
> Indeed, as I told Robin Hankin privately, that was the design reason.

Brian Ripley also reminded me that the sum() of integers is an integer, behaviour that I find desirable.

The reason for my starting this thread is that sometimes I actually *want* sums of integers to overflow: my interest is in exact computations where I must be absolutely certain that there can be no rounding error.

If the sum cannot be represented in integers, I want this fact to be flagged with extreme vigour as it signals what might be catastrophic loss of precision.

At least, that's my current thinking.

best wishes

rksh

>> Tim Hesterberg
>>
>>> ... So, why does rowSums() coerce to double (behaviour
>>> that is undesirable for me)?

--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
 tel  023-8059-7743
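The overflow behaviour described in this message can be seen with a two-element example. This is a sketch only; checked_sum is a hypothetical wrapper, not something proposed in the thread, and it promotes any warning raised by sum() to an error.

```r
big <- c(.Machine$integer.max, 1L)

# sum() of integers stays integer; on overflow it returns NA and warns.
s <- suppressWarnings(sum(big))

# rowSums() on the same values computes in double, so nothing overflows.
d <- rowSums(matrix(big, nrow = 1))

# Hypothetical wrapper: turn the overflow warning into an error.
checked_sum <- function(x) {
  tryCatch(
    sum(x),
    warning = function(w) stop("integer overflow in sum(): ", conditionMessage(w))
  )
}

ok <- checked_sum(1:10)
failed <- inherits(try(checked_sum(big), silent = TRUE), "try-error")
```

With integer input, s is NA_integer_, d is the double 2147483648, and checked_sum() stops instead of returning NA.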
Re: [R] rowSums() and is.integer()
On Tue, 20 Nov 2007, Tim Hesterberg wrote:

> I wrote the original rowSums (in S-PLUS).
> There, rowSums() does not coerce integer to double.

Actually, neither does R. It computes a double answer but does no coercion per se.

> However, one advantage of coercion is to avoid integer overflow.

Indeed, as I told Robin Hankin privately, that was the design reason.

> Tim Hesterberg
>
>> ... So, why does rowSums() coerce to double (behaviour
>> that is undesirable for me)?

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [R] rowSums() and is.integer()
I wrote the original rowSums (in S-PLUS).
There, rowSums() does not coerce integer to double.
However, one advantage of coercion is to avoid integer overflow.

Tim Hesterberg

>... So, why does rowSums() coerce to double (behaviour
>that is undesirable for me)?
Re: [R] rowSums() and is.integer()
On 10 Nov 2007, at 07:32, Prof Brian Ripley wrote:

> On Fri, 9 Nov 2007, Robin Hankin wrote:
>
>> Hi
>>
>> [R-2.6.0, macOSX 10.4.10].
>>
>> The help page says that rowSums() and colSums()
>> are equivalent to 'apply' with 'FUN = sum'.
>>
>> But I came across this:
>>
>> > a <- matrix(1:30,5,6)
>> > is.integer(apply(a,1,sum))
>> [1] TRUE
>> > is.integer(rowSums(a))
>> [1] FALSE
>
> 'equivalent' does not mean 'identical': the wording was deliberate.
>
>> so rowSums() returns a float.
>
> And that is what the help page says it does (albeit more
> accurately: there is no 'float' type, but there is numeric aka
> double and the result could be complex).
>
>> Why is this?
>
> You seem to be asking why R works as documented!

Yes, that's exactly what I was asking [perhaps this should have been R-devel?].

What is the thinking behind converting to double? I expect that part of the answer is speed:

# First define an integer matrix:
a <- matrix(as.integer(rpois(1e6,3)),1000,1000)

> system.time(rowSums(a))
   user  system elapsed
  0.049   0.000   0.050
> system.time(rowSums(a))
   user  system elapsed
  0.050   0.000   0.051
> system.time(rowSums(a))
   user  system elapsed
  0.050   0.001   0.052
> system.time(colSums(a))
   user  system elapsed
  0.043   0.001   0.046
> system.time(colSums(a))
   user  system elapsed
  0.043   0.000   0.044

About the same speed. Now use apply() to see whether integer summation is faster than double summation for this kind of problem:

> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.085   0.009   0.094
> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.086   0.010   0.095
> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.089   0.010   0.104
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.071   0.008   0.078
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.069   0.007   0.076
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.070   0.008   0.081

# Now convert to double:
> a <- a+0
> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.127   0.019   0.151
> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.121   0.017   0.139
> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.130   0.022   0.175
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.084   0.015   0.098
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.085   0.015   0.105
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.087   0.016   0.107

[can anyone comment on the difference between the first three and the last three double precision summations?]

Perhaps a little bit faster for the integers, but there's not much in it.

So, why does rowSums() coerce to double (behaviour that is undesirable for me)?

--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
 tel  023-8059-7743
Re: [R] rowSums() and is.integer()
On Fri, 9 Nov 2007, Robin Hankin wrote:

> Hi
>
> [R-2.6.0, macOSX 10.4.10].
>
> The help page says that rowSums() and colSums()
> are equivalent to 'apply' with 'FUN = sum'.
>
> But I came across this:
>
> > a <- matrix(1:30,5,6)
> > is.integer(apply(a,1,sum))
> [1] TRUE
> > is.integer(rowSums(a))
> [1] FALSE

'equivalent' does not mean 'identical': the wording was deliberate.

> so rowSums() returns a float.

And that is what the help page says it does (albeit more accurately: there is no 'float' type, but there is numeric aka double and the result could be complex).

> Why is this?

You seem to be asking why R works as documented!

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
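The distinction between 'equivalent' and 'identical' drawn in this reply can be checked directly with typeof(). A small sketch, using the matrix from the original post plus a complex matrix z that is made up here for illustration:

```r
a <- matrix(1:30, 5, 6)

type_apply   <- typeof(apply(a, 1, sum))     # "integer": sum() of integers is integer
type_rowsums <- typeof(rowSums(a))           # "double": rowSums() returns numeric
same_values  <- all(rowSums(a) == apply(a, 1, sum))  # values agree, storage types differ

# Illustrative complex input (not from the thread): the result is complex.
z <- matrix(complex(real = 1:4, imaginary = 1), 2, 2)
type_complex <- typeof(rowSums(z))           # "complex"
```

So the two calls agree in value for this matrix and differ only in the storage type of the result.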
[R] rowSums() and is.integer()
Hi

[R-2.6.0, macOSX 10.4.10].

The help page says that rowSums() and colSums() are equivalent to 'apply' with 'FUN = sum'.

But I came across this:

> a <- matrix(1:30,5,6)
> is.integer(apply(a,1,sum))
[1] TRUE
> is.integer(rowSums(a))
[1] FALSE

so rowSums() returns a float.

Why is this?

--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
 tel  023-8059-7743