Re: [R] Can't compute row means of two columns of a dataframe.
Would this work?

xxxz$Average20 <- (xxxz$Low20 + xxxz$High20)/2

I tried this earlier, but it does not appear to have gone through.

Tim

-----Original Message-----
From: R-help On Behalf Of avi.e.gr...@gmail.com
Sent: Saturday, June 8, 2024 2:16 PM
To: 'Sorkin, John' ; r-help@r-project.org
Subject: Re: [R] Can't compute row means of two columns of a dataframe.

[External Email]

John,

Maybe you can clarify what you want the output to look like. It took me a while to realize what you may want, as it is NOT properly described as wanting rowsums.

There is a standard function called rowMeans() that probably does what you want if you want the mean of each row taken across all columns, as in:

> rowMeans(xxxz)
 [1]  84.3  87.0  89.7  92.3  95.0  97.7 100.3 103.7 106.3 109.0 112.3 115.0
[13] 118.0 121.3 124.0 127.3 130.7 134.0 137.0

It does not add the means to the original data.frame, if you wanted them there, but that is easy enough to do.

> xxxz$Average20 <- rowMeans(xxxz)
> head(xxxz)
  TotalInches Low20 High20 Average20
1          58    84    111      84.3
2          59    87    115      87.0
3          60    90    119      89.7
4          61    93    123      92.3
5          62    96    127      95.0
6          63    99    131      97.7

Your construct is more complex, and it looks like you want to do this to a subset of two columns. Again, straightforward:

xxxz$Average20 <- rowMeans(xxxz[, c("Low20", "High20")])

And I probably would do this using a dplyr mutate, but that is outside the scope.

This does not help explain your error, so let me look at what you are trying to do. What did you expect to use by() for in the second argument? You seem to be giving it INDICES of the first column entries. What is that for?

by(xxxz[,c("Low20","High20")], xxxz[,"TotalInches"], mean)

The documentation suggests this is for splitting by factors. I do not see multiple instances of any TotalInches value, so why is this needed for some kind of grouping?

My guess is you are using the wrong function, or using it the wrong way for your needs. The warnings may relate to that.

-----Original Message-----
From: R-help On Behalf Of Sorkin, John
Sent: Saturday, June 8, 2024 1:38 PM
To: r-help@r-project.org (r-help@r-project.org)
Subject: [R] Can't compute row means of two columns of a dataframe.

I have a data frame with three columns, TotalInches, Low20, High20. For each row of the dataset, I am trying to compute the mean of Low20 and High20.

xxxz <- structure(list(TotalInches = c(58, 59, 60, 61, 62, 63, 64, 65,
  66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76), Low20 = c(84, 87,
  90, 93, 96, 99, 102, 106, 109, 112, 116, 119, 122, 126, 129,
  133, 137, 141, 144), High20 = c(111, 115, 119, 123, 127, 131,
  135, 140, 144, 148, 153, 157, 162, 167, 171, 176, 181, 186, 191
  )), class = "data.frame", row.names = c(NA, -19L))
xxxz
str(xxxz)
xxxz$Average20 <- by(xxxz[,c("Low20","High20")],xxxz[,"TotalInches"],mean)
warnings()

When I run the code above, I don't get the means by row. I get the following warning messages, one for each row of the dataframe.

Warning messages:
1: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA

Can someone tell me what I am doing wrong, and how I can compute the row means?

Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center;
PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
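
Tim's one-line arithmetic and the rowMeans() call restricted to the same two columns should give the same numbers. The sketch below is one way to check that, assuming the data frame from the original post; the variable names by_arithmetic and by_rowmeans are made up for this illustration and do not appear in the thread.

```r
# Rebuild the data frame from the original post.
xxxz <- data.frame(
  TotalInches = as.numeric(58:76),
  Low20  = c(84, 87, 90, 93, 96, 99, 102, 106, 109, 112,
             116, 119, 122, 126, 129, 133, 137, 141, 144),
  High20 = c(111, 115, 119, 123, 127, 131, 135, 140, 144, 148,
             153, 157, 162, 167, 171, 176, 181, 186, 191)
)

# Element-wise arithmetic on the two columns (illustrative name).
by_arithmetic <- (xxxz$Low20 + xxxz$High20) / 2

# rowMeans() restricted to the same two columns (illustrative name).
by_rowmeans <- rowMeans(xxxz[, c("Low20", "High20")])

# Drop any names before comparing, so only the values are checked.
all.equal(by_arithmetic, unname(by_rowmeans))

# First row: (84 + 111) / 2
by_arithmetic[1]
```

Either result can be assigned to xxxz$Average20; the choice is mostly a matter of taste when there are only two columns.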
Re: [R] Can't compute row means of two columns of a dataframe.
Can this problem be made more direct?

xxxz$Average.20 <- (xxxz$Low20 + xxxz$High20)/2

That is literally the mean of the two columns. Functions can be useful if there will be more columns, but with just two this seems easier.

I will point out that an average daily temperature taken as the midpoint between the minimum and maximum contains a fair bit of error, because that is only roughly how heating and cooling respond. I admit that sometimes there are no other choices and we work with the available data.

Tim
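
The remark that functions become useful once there are more columns can be sketched as follows. The helper row_mean_of() is hypothetical (it is not from the thread or from any package), and the data frame holds only the first three rows of the posted Low20 and High20 values to keep the example short.

```r
# Hypothetical helper: per-row mean of the named columns.
# Works for any number of numeric columns, not just two.
row_mean_of <- function(df, cols) {
  stopifnot(all(cols %in% names(df)))                        # columns must exist
  stopifnot(all(vapply(df[cols], is.numeric, logical(1))))   # and be numeric
  unname(rowMeans(df[cols]))
}

# First three rows of the posted data (Low20 and High20 only).
xxxz <- data.frame(Low20 = c(84, 87, 90), High20 = c(111, 115, 119))

xxxz$Average20 <- row_mean_of(xxxz, c("Low20", "High20"))
xxxz$Average20   # 97.5 101.0 104.5
```

With two columns this is the same result as the arithmetic form; the helper only pays off when the column list grows or changes.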
Re: [R] Can't compute row means of two columns of a dataframe.
John,

Maybe you can clarify what you want the output to look like. It took me a while to realize what you may want, as it is NOT properly described as wanting rowsums.

There is a standard function called rowMeans() that probably does what you want if you want the mean of each row taken across all columns, as in:

> rowMeans(xxxz)
 [1]  84.3  87.0  89.7  92.3  95.0  97.7 100.3 103.7 106.3 109.0 112.3 115.0
[13] 118.0 121.3 124.0 127.3 130.7 134.0 137.0

It does not add the means to the original data.frame, if you wanted them there, but that is easy enough to do.

> xxxz$Average20 <- rowMeans(xxxz)
> head(xxxz)
  TotalInches Low20 High20 Average20
1          58    84    111      84.3
2          59    87    115      87.0
3          60    90    119      89.7
4          61    93    123      92.3
5          62    96    127      95.0
6          63    99    131      97.7

Your construct is more complex, and it looks like you want to do this to a subset of two columns. Again, straightforward:

xxxz$Average20 <- rowMeans(xxxz[, c("Low20", "High20")])

And I probably would do this using a dplyr mutate, but that is outside the scope.

This does not help explain your error, so let me look at what you are trying to do. What did you expect to use by() for in the second argument? You seem to be giving it INDICES of the first column entries. What is that for?

by(xxxz[,c("Low20","High20")], xxxz[,"TotalInches"], mean)

The documentation suggests this is for splitting by factors. I do not see multiple instances of any TotalInches value, so why is this needed for some kind of grouping?

My guess is you are using the wrong function, or using it the wrong way for your needs. The warnings may relate to that.
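
The warnings in the original post name mean.default(), which suggests what goes wrong: by() hands each group to the function as a data frame, and mean() does not accept a data frame. The sketch below reproduces that on the first three rows of the posted data. The names one_row and via_by are illustrative, and the anonymous function passed to by() is just one possible workaround, not something proposed in the thread.

```r
# First three rows of the posted data.
xxxz <- data.frame(
  TotalInches = c(58, 59, 60),
  Low20  = c(84, 87, 90),
  High20 = c(111, 115, 119)
)

# by() splits the rows by the grouping values and passes each piece to the
# function as a data frame. mean() falls through to mean.default(), which
# warns and returns NA for anything that is not numeric or logical.
one_row <- xxxz[1, c("Low20", "High20")]
is.data.frame(one_row)              # TRUE: a one-row data frame, not a vector
suppressWarnings(mean(one_row))     # NA

# Flattening each piece to a numeric vector makes by() work, but it is a
# roundabout route when every TotalInches value occurs exactly once.
via_by <- by(xxxz[, c("Low20", "High20")], xxxz[, "TotalInches"],
             function(piece) mean(unlist(piece)))
as.numeric(via_by)                  # 97.5 101.0 104.5

# The direct equivalent suggested above.
rowMeans(xxxz[, c("Low20", "High20")])
```

by() earns its keep when several rows share a grouping value; with one row per group it only adds overhead over rowMeans().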
Re: [R] Can't compute row means of two columns of a dataframe.
Incidentally, FWIW, for means, rowMeans() is a lot faster:

xxxz$av20 <- rowMeans(xxxz[,c("Low20","High20")])

Bert

On Sat, Jun 8, 2024 at 10:47 AM Bert Gunter wrote:

> Use apply(), not by().
>
> xxxz$av20 <- apply(xxxz[,c("Low20","High20")],1, mean)
>
> -- Bert
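
A rough way to see the speed difference for yourself is to time both calls on a larger table. The data frame big below is made up for this purpose (100,000 rows of random values, not the posted data), and the timings will vary by machine, so no figures are quoted here.

```r
set.seed(1)

# Hypothetical larger data frame, used only to make the timing gap visible.
big <- data.frame(Low20  = runif(1e5, 80, 150),
                  High20 = runif(1e5, 110, 200))

# system.time() evaluates the assignment and reports how long it took.
t_apply    <- system.time(m_apply    <- apply(big[, c("Low20", "High20")], 1, mean))
t_rowmeans <- system.time(m_rowmeans <- rowMeans(big[, c("Low20", "High20")]))

# Same values either way (names dropped before comparing).
all.equal(unname(m_apply), unname(m_rowmeans))

# Elapsed seconds for each approach; exact figures depend on the machine.
c(apply = t_apply[["elapsed"]], rowMeans = t_rowmeans[["elapsed"]])
```

apply() calls mean() once per row from R, while rowMeans() does the whole computation in compiled code, which is the usual explanation for the gap.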
Re: [R] Can't compute row means of two columns of a dataframe.
Use apply(), not by().

xxxz$av20 <- apply(xxxz[,c("Low20","High20")],1, mean)

-- Bert
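
For readers unfamiliar with apply(), the second argument is the margin: 1 means "once per row", 2 means "once per column". The sketch below illustrates this on the first three rows of the posted data; the names m, row_means and col_means are illustrative only.

```r
# First three rows of the posted data.
xxxz <- data.frame(
  TotalInches = c(58, 59, 60),
  Low20  = c(84, 87, 90),
  High20 = c(111, 115, 119)
)

# apply() works on a matrix, so the selected columns are converted first.
m <- as.matrix(xxxz[, c("Low20", "High20")])
dim(m)   # 3 rows, 2 columns

# MARGIN = 1 applies mean() to each row; MARGIN = 2 to each column.
row_means <- apply(m, 1, mean)
col_means <- apply(m, 2, mean)

unname(row_means)   # 97.5 101.0 104.5
col_means           # Low20 = 87, High20 = 115
```

Passing the data frame directly, as in the message above, gives the same row means because apply() performs the matrix conversion itself.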