Re: [R] Can't compute row means of two columns of a dataframe.

2024-06-08 Thread Ebert,Timothy Aaron
Would this work?

xxxz$Average20 <- (xxxz$Low20 + xxxz$High20)/2

I tried this earlier but it does not appear to have gone through.

Tim

-----Original Message-----
From: R-help  On Behalf Of avi.e.gr...@gmail.com
Sent: Saturday, June 8, 2024 2:16 PM
To: 'Sorkin, John' ; r-help@r-project.org
Subject: Re: [R] Can't compute row means of two columns of a dataframe.

John,

Maybe you can clarify what you want the output to look like. It took me a while 
to realize what you may want as it is NOT properly described as wanting rowsums.

There is a standard function called rowMeans() that probably does what you want 
if you want the mean across all columns of each row, as in:

> rowMeans(xxxz)
 [1]  84.3  87.0  89.7  92.3  95.0  97.7 100.3 103.7 106.3 109.0 112.3 115.0
[13] 118.0 121.3 124.0 127.3 130.7 134.0 137.0

It does not add the means to the original data.frame if you wanted it there but 
that is easy enough to do.

> xxxz$Average20 <- rowMeans(xxxz)
> head(xxxz)
  TotalInches Low20 High20 Average20
1          58    84    111      84.3
2          59    87    115      87.0
3          60    90    119      89.7
4          61    93    123      92.3
5          62    96    127      95.0
6          63    99    131      97.7

Your construct is more complex and it looks like you want to do this to a 
subset of two columns. Again, straightforward:

xxxz$Average20 <- rowMeans(xxxz[, c("Low20", "High20")])

And I probably would do this using a dplyr mutate but that is outside the scope.

This does not help explain your error, so let me look at what you are trying to 
do.


What did you expect the second argument of by() to do? You seem to be giving 
it the first column's entries as INDICES. What is that for?

by(xxxz[,c("Low20","High20")],
   xxxz[,"TotalInches"],
   mean)

The documentation suggests this is for splitting by factors. I do not see 
multiple instances of any TotalInches value, so why is this needed for some 
kind of grouping?

My guess is you are using the wrong function or the wrong way for your needs. 
The warnings may relate to that.


-----Original Message-----
From: R-help  On Behalf Of Sorkin, John
Sent: Saturday, June 8, 2024 1:38 PM
To: r-help@r-project.org (r-help@r-project.org) 
Subject: [R] Can't compute row means of two columns of a dataframe.

I have a data frame with three columns, TotalInches, Low20, High20. For each 
row of the dataset, I am trying to compute the mean of Low20 and High20.

xxxz <- structure(list(TotalInches =
  c(58, 59, 60, 61, 62, 63, 64, 65,
    66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76), Low20 = c(84, 87,
    90, 93, 96, 99, 102, 106, 109, 112, 116, 119, 122, 126, 129,
    133, 137, 141, 144), High20 = c(111, 115, 119, 123, 127, 131,
    135, 140, 144, 148, 153, 157, 162, 167, 171, 176, 181, 186, 191
    )), class = "data.frame", row.names = c(NA, -19L))
xxxz
str(xxxz)
xxxz$Average20 <- by(xxxz[,c("Low20","High20")],xxxz[,"TotalInches"],mean)
warnings()

When I run the code above, I don't get the means by row. I get the following 
warning messages, one for each row of the dataframe.

Warning messages:
1: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA

Can someone tell me what I am doing wrong, and how I can compute the row means?

Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine; Associate 
Director for Biostatistics and Informatics, Baltimore VA Medical Center 
Geriatrics Research, Education, and Clinical Center; PI Biostatistics and 
Informatics Core, University of Maryland School of Medicine Claude D. Pepper 
Older Americans Independence Center; Senior Statistician University of Maryland 
Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can't compute row means of two columns of a dataframe.

2024-06-08 Thread Ebert,Timothy Aaron
Can this problem be made more direct?

xxxz$Average.20 <- (xxxz$Low20 + xxxz$High20)/2

That is literally the mean of two columns. Functions can be useful if there 
will be more columns, but with just two this seems easier.
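
As a small illustration of the trade-off between the two approaches, here is a sketch on a made-up three-row data frame. The column names follow the thread, but the NA in the last row is an assumption added for the example; the data posted in the thread has no missing values.

```r
# Hypothetical data: same column names as the thread, with one NA added
# on purpose to show where the two approaches can differ.
d <- data.frame(Low20 = c(84, 87, NA),
                High20 = c(111, 115, 119))

# Direct arithmetic: an NA in either column makes that row's result NA.
by_hand <- (d$Low20 + d$High20) / 2

# rowMeans() gives the same values by default ...
rm_default <- rowMeans(d[, c("Low20", "High20")])

# ... but can be asked to skip missing values within each row.
rm_skip <- rowMeans(d[, c("Low20", "High20")], na.rm = TRUE)
```

With complete data the direct formula and rowMeans() agree; the na.rm argument only matters once missing values appear.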

I will point out that the average daily temperature based on the midpoint 
between minimum and maximum contains a fair bit of error because that is only 
roughly how heating and cooling respond. I admit that sometimes there are no 
other choices and we work with available data.

Tim




Re: [R] Can't compute row means of two columns of a dataframe.

2024-06-08 Thread avi.e.gross
John,

Maybe you can clarify what you want the output to look like. It took me a
while to realize what you may want as it is NOT properly described as
wanting rowsums.

There is a standard function called rowMeans() that probably does what you
want if you want the mean across all columns of each row, as in:

> rowMeans(xxxz)
 [1]  84.3  87.0  89.7  92.3  95.0  97.7 100.3 103.7 106.3 109.0 112.3 115.0
[13] 118.0 121.3 124.0 127.3 130.7 134.0 137.0

It does not add the means to the original data.frame if you wanted it there
but that is easy enough to do.

> xxxz$Average20 <- rowMeans(xxxz)
> head(xxxz)
  TotalInches Low20 High20 Average20
1          58    84    111      84.3
2          59    87    115      87.0
3          60    90    119      89.7
4          61    93    123      92.3
5          62    96    127      95.0
6          63    99    131      97.7

Your construct is more complex and it looks like you want to do this to a
subset of two columns. Again, straightforward:

xxxz$Average20 <- rowMeans(xxxz[, c("Low20", "High20")])

And I probably would do this using a dplyr mutate but that is outside the
scope.

This does not help explain your error, so let me look at what you are trying
to do.


What did you expect the second argument of by() to do? You seem to be giving
it the first column's entries as INDICES. What is that for?

by(xxxz[,c("Low20","High20")],
   xxxz[,"TotalInches"],
   mean)

The documentation suggests this is for splitting by factors. I do not see
multiple instances of any TotalInches value, so why is this needed for
some kind of grouping?

My guess is you are using the wrong function or the wrong way for your
needs. The warnings may relate to that.
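
One plausible reading of the warnings, sketched on a shortened three-row copy of the data: by() groups the rows by TotalInches and passes each group to mean() as a data frame, which mean() cannot handle. The split() call below is only an assumed stand-in to make that step visible; it is not how by() is implemented internally.

```r
# Three rows in the same layout as the thread's data, for illustration only.
d <- data.frame(TotalInches = c(58, 59, 60),
                Low20 = c(84, 87, 90),
                High20 = c(111, 115, 119))

# Grouping by TotalInches yields one single-row data frame per distinct value.
pieces <- split(d[, c("Low20", "High20")], d$TotalInches)
piece_class <- class(pieces[[1]])

# mean() has no data-frame method, so mean.default() warns
# "argument is not numeric or logical" and returns NA.
res <- suppressWarnings(mean(pieces[[1]]))

# Flattening each piece to a numeric vector first gives the intended value.
fixed <- sapply(pieces, function(p) mean(unlist(p)))

# rowMeans() reaches the same numbers with no grouping step at all.
direct <- rowMeans(d[, c("Low20", "High20")])
```

Since every TotalInches value occurs once, the grouping adds nothing here, which is why rowMeans() on the two columns is the simpler route.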



Re: [R] Can't compute row means of two columns of a dataframe.

2024-06-08 Thread Bert Gunter
Incidentally, FWIW, for means, rowMeans() is a lot faster:

xxxz$av20 <- rowMeans(xxxz[,c("Low20","High20")])
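
A rough way to check the speed claim on your own machine is sketched below. The data frame `big` and its size are assumptions made up for the comparison; the thread's data has only 19 rows, where the difference would be too small to notice.

```r
set.seed(1)

# Hypothetical larger data frame; column names follow the thread.
big <- data.frame(Low20 = runif(1e5), High20 = runif(1e5))

# Time the row-by-row apply() call and the vectorised rowMeans() call.
t_apply <- system.time(a <- apply(big[, c("Low20", "High20")], 1, mean))
t_rowmeans <- system.time(r <- rowMeans(big[, c("Low20", "High20")]))

# Both approaches should return the same numbers.
same_result <- isTRUE(all.equal(unname(a), unname(r)))

# Elapsed seconds for each; the exact figures depend on the machine.
print(c(apply = t_apply[["elapsed"]], rowMeans = t_rowmeans[["elapsed"]]))
```

apply() calls mean() once per row from R, while rowMeans() does the work in compiled code, so the gap tends to grow with the number of rows.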

Bert



On Sat, Jun 8, 2024 at 10:47 AM Bert Gunter  wrote:

> Use apply(), not by().
>
> xxxz$av20 <- apply(xxxz[,c("Low20","High20")],1, mean)
>
> -- Bert


Re: [R] Can't compute row means of two columns of a dataframe.

2024-06-08 Thread Bert Gunter
Use apply(), not by().

xxxz$av20 <- apply(xxxz[,c("Low20","High20")],1, mean)
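
For readers unsure why apply() succeeds where by() did not, here is a minimal sketch on a three-row copy of the data. The intermediate objects `m`, `row_is_numeric` and `col_means` are illustrative names, not part of the original post.

```r
# Three rows in the same layout as the thread's data, for illustration only.
d <- data.frame(TotalInches = c(58, 59, 60),
                Low20 = c(84, 87, 90),
                High20 = c(111, 115, 119))

# apply() coerces the data frame to a matrix, so each row reaches mean()
# as a plain numeric vector rather than as a data frame.
m <- as.matrix(d[, c("Low20", "High20")])
row_is_numeric <- is.numeric(m[1, ])

# MARGIN = 1 means "once per row".
d$av20 <- apply(d[, c("Low20", "High20")], 1, mean)

# MARGIN = 2 would instead give one mean per column.
col_means <- apply(d[, c("Low20", "High20")], 2, mean)
```

The second argument is the part that is easy to get wrong: 1 walks over rows, 2 over columns.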

-- Bert
