Re: [R] by gives no results, gives warning that data are non-numeric, but the data appears to be numeric.

2015-12-28 Thread John Sorkin
Thank you,
John



John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 

Re: [R] by gives no results, gives warning that data are non-numeric, but the data appears to be numeric.

2015-12-27 Thread Rolf Turner


You are trying to take the mean of data frames.  There is no 
"data.frame" method for mean().


Try:

by(hold,Arm,function(x){sapply(x,mean)})
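
A minimal sketch of the difference, using only the first six rows of the data posted in this thread (the subset and the names `whole` and `res` are illustrative assumptions, not part of the original post):

```r
# First six rows of the posted data (illustrative subset only).
hold <- data.frame(
  Wtscr = c(120.2, 104.6, 97.2, 103.9, 58.2, 130.9),
  Wt0   = c(117.2, 105.3, 96.2, 106.1, 56.7, 127.4),
  Wt6   = c(114.6, 104.8, 93.8, 101.7, 55.5, 127.6)
)
Arm <- factor(c("PUFA", "PUFA", "MUFA", "MUFA", "MUFA", "MUFA"))

# mean() has no data.frame method, so this call falls through to
# mean.default(), which warns and returns NA. The warning is
# suppressed here on purpose so the sketch runs quietly.
whole <- suppressWarnings(mean(hold))
is.na(whole)   # TRUE

# Taking the mean column by column within each group works.
res <- by(hold, Arm, function(x) sapply(x, mean))
res[["MUFA"]]  # named vector of column means for the MUFA rows
```

The same call on the full 39-row data should behave the same way, since every column there is numeric as well.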


BTW what's the point of "na.rm=TRUE" in your call?  There are no missing 
values in the data that you present.


In future, please use dput() to present your data; it makes life a lot 
easier for respondents.
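
As a rough illustration of why dput() helps, here is a sketch with a three-row stand-in for the posted data (the object names `small`, `txt` and `rebuilt` are made up for this example):

```r
# A tiny stand-in for the real data.
small <- data.frame(
  Wtscr = c(120.2, 104.6, 97.2),
  Arm   = factor(c("PUFA", "PUFA", "MUFA"))
)

# dput() prints R code that recreates the object, including the
# column types and factor levels, so it can be pasted into an email.
dput(small)

# A respondent can rebuild the object from that text alone.
txt     <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```

Unlike a printed listing, the dput() text is not damaged by column headers running together when the mail is re-wrapped.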


cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


[R] by gives no results, gives warning that data are non-numeric, but the data appears to be numeric.

2015-12-27 Thread John Sorkin
When I run by, I get an error message and no results. Any help in understanding 
what is wrong would be appreciated.
 
Error message:
Warning messages:
1: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA

 
Results:
Arm: MUFA
[1] NA
---
 
Arm: PUFA
[1] NA

Code:
by(hold,Arm,mean,na.rm=TRUE)

I don't understand why I am getting the error message, and why I am not getting 
any results. I don't believe my data are non-numeric. 

by() with str works fine and confirms that the data are numeric:
> by(hold,Arm,str)
'data.frame':   23 obs. of  3 variables:
 $ Wtscr: num  97.2 103.9 58.2 130.9 135 ...
 $ Wt0  : num  96.2 106.1 56.7 127.4 133.1 ...
 $ Wt6  : num  93.8 101.7 55.5 127.6 130.9 ...
'data.frame':   16 obs. of  3 variables:
 $ Wtscr: num  120.2 104.6 100.1 74.8 112.6 ...
 $ Wt0  : num  117.2 105.3 99.5 75.7 110.7 ...
 $ Wt6  : num  114.6 104.8 84.5 77.7 107.4 ...
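
The str() output describes the columns, whereas the warning refers to the object handed to the function, which (per the `data[x, , drop = FALSE]` in the warning text) is a whole data frame. A small check along these lines, using the first three MUFA rows as an illustrative subset, may make the distinction visible:

```r
# First three MUFA rows of the posted data (illustrative subset only).
hold <- data.frame(
  Wtscr = c(97.2, 103.9, 58.2),
  Wt0   = c(96.2, 106.1, 56.7),
  Wt6   = c(93.8, 101.7, 55.5)
)

# The data frame itself is a list of columns, not a numeric vector.
frame_is_numeric <- is.numeric(hold)           # FALSE

# Each individual column is numeric, which is what str() reports.
cols_are_numeric <- sapply(hold, is.numeric)   # TRUE for every column

frame_is_numeric
cols_are_numeric
```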
Here is a listing of my data:
> hold
   Wtscr   Wt0    Wt6
1  120.2 117.2 114.60
2  104.6 105.3 104.80
3   97.2  96.2  93.80
4  103.9 106.1 101.70
5   58.2  56.7  55.50
6  130.9 127.4 127.60
7  135.0 133.1 130.90
8  100.1  99.5  84.50
9  130.3 115.3 115.80
10 150.5 148.7 133.40
11  74.8  75.7  77.70
12 112.6 110.7 107.40
13  90.0  91.0  83.40
14 139.1 138.5 126.70
15  99.1  96.4  95.70
16 108.3 107.5 109.30
17  75.1  72.9  72.20
18  97.5 102.1  98.50
19 202.2  90.1  90.60
20  91.7  89.4  93.40
21 102.1 102.2 100.80
22 116.9 118.9 118.00
23  94.6  95.3  90.30
24 122.2 117.0 117.00
25 105.6 103.3 103.60
26  96.9  96.8  98.80
27 102.9 100.3  89.00
28 115.8 118.5 117.30
29  95.7  96.2  95.40
30  88.2  86.9  88.30
31 108.7 108.8 108.80
32  89.2  88.6  81.20
33  86.8  86.5  82.70
34 135.5 130.1 125.40
35 112.5 113.9 111.45
36 111.0 105.3 109.50
37 103.4 100.5  95.50
38 117.6 117.4 101.40
39 116.7 118.5 101.80

The INDEX is clearly a factor: 
> Arm
 [1] PUFA PUFA MUFA MUFA MUFA MUFA MUFA PUFA MUFA MUFA PUFA PUFA PUFA MUFA MUFA PUFA PUFA PUFA MUFA MUFA MUFA MUFA MUFA PUFA MUFA MUFA PUFA MUFA PUFA MUFA PUFA
[32] MUFA MUFA MUFA MUFA MUFA PUFA PUFA PUFA
Levels: MUFA PUFA

The data and the index have the same length:
> cbind(hold,Arm)
   Wtscr   Wt0    Wt6  Arm
1  120.2 117.2 114.60 PUFA
2  104.6 105.3 104.80 PUFA
3   97.2  96.2  93.80 MUFA
4  103.9 106.1 101.70 MUFA
5   58.2  56.7  55.50 MUFA
6  130.9 127.4 127.60 MUFA
7  135.0 133.1 130.90 MUFA
8  100.1  99.5  84.50 PUFA
9  130.3 115.3 115.80 MUFA
10 150.5 148.7 133.40 MUFA
11  74.8  75.7  77.70 PUFA
12 112.6 110.7 107.40 PUFA
13  90.0  91.0  83.40 PUFA
14 139.1 138.5 126.70 MUFA
15  99.1  96.4  95.70 MUFA
16 108.3 107.5 109.30 PUFA
17  75.1  72.9  72.20 PUFA
18  97.5 102.1  98.50 PUFA
19 202.2  90.1  90.60 MUFA
20  91.7  89.4  93.40 MUFA
21 102.1 102.2 100.80 MUFA
22 116.9 118.9 118.00 MUFA
23  94.6  95.3  90.30 MUFA
24 122.2 117.0 117.00 PUFA
25 105.6 103.3 103.60 MUFA
26  96.9  96.8  98.80 MUFA
27 102.9 100.3  89.00 PUFA
28 115.8 118.5 117.30 MUFA
29  95.7  96.2  95.40 PUFA
30  88.2  86.9  88.30 MUFA
31 108.7 108.8 108.80 PUFA
32  89.2  88.6  81.20 MUFA
33  86.8  86.5  82.70 MUFA
34 135.5 130.1 125.40 MUFA
35 112.5 113.9 111.45 MUFA
36 111.0 105.3 109.50 MUFA
37 103.4 100.5  95.50 PUFA
38 117.6 117.4 101.40 PUFA
39 116.7 118.5 101.80 PUFA

But the by function does not work!
> by(hold,Arm,mean,na.rm=TRUE)
Arm: MUFA
[1] NA
---
 
Arm: PUFA
[1] NA
Warning messages:
1: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA


Perhaps this is a hint: print does not give two separate groups:
> by(hold,Arm,print)
   Wtscr   Wt0    Wt6
3   97.2  96.2  93.80
4  103.9 106.1 101.70
5   58.2  56.7  55.50
6  130.9 127.4 127.60
7  135.0 133.1 130.90
9  130.3 115.3 115.80
10 150.5 148.7 133.40
14 139.1 138.5 126.70
15  99.1  96.4  95.70
19 202.2  90.1  90.60
20  91.7  89.4  93.40
21 102.1 102.2 100.80
22 116.9 118.9 118.00
23  94.6  95.3  90.30
25 105.6 103.3 103.60
26  96.9  96.8  98.80
28 115.8 118.5 117.30
30  88.2  86.9  88.30
32  89.2  88.6  81.20
33  86.8  86.5  82.70
34 135.5 130.1 125.40
35 112.5 113.9 111.45
36 111.0 105.3 109.50
   Wtscr   Wt0   Wt6
1  120.2 117.2 114.6
2  104.6 105.3 104.8
8  100.1  99.5  84.5
11  74.8  75.7  77.7
12 112.6 110.7 107.4
13  90.0  91.0  83.4
16 108.3 107.5 109.3
17  75.1  72.9  72.2
18  97.5 102.1  98.5
24 122.2 117.0 117.0
27 102.9 100.3  89.0
29  95.7  96.2  95.4
31 108.7 108.8 108.8
37 103.4 100.5  95.5
38 117.6 117.4 101.4
39 116.7 118.5 101.8
Arm: MUFA
   Wtscr   Wt0    Wt6
3   97.2  96.2  93.80
4  103.9 106.1 101.70
5   58.2  56.7  55.50
6  130.9 127.4 127.60

Re: [R] by gives no results, gives warning that data are non-numeric, but the data appears to be numeric.

2015-12-27 Thread John Sorkin
Rolf,
Thank you!
John


Re: [R] by gives no results, gives warning that data are non-numeric, but the data appears to be numeric.

2015-12-27 Thread William Dunlap via R-help
by(dataFrame, groupId, FUN) applies FUN to a bunch of data.frames (row subsets
of the dataFrame input).  mean() returns NA for data.frames.  You could use
FUN=colMeans if you wanted column means or FUN=function(x)mean(colMeans(x))
or FUN=function(x)mean(unlist(x)) if you wanted some version of a grand mean
over all the columns.
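
A sketch of those three choices of FUN, applied to the first six rows of the data posted in this thread (the subset and the result names `col_means`, `grand1` and `grand2` are assumptions made for illustration):

```r
# First six rows of the posted data (illustrative subset only).
hold <- data.frame(
  Wtscr = c(120.2, 104.6, 97.2, 103.9, 58.2, 130.9),
  Wt0   = c(117.2, 105.3, 96.2, 106.1, 56.7, 127.4),
  Wt6   = c(114.6, 104.8, 93.8, 101.7, 55.5, 127.6)
)
Arm <- factor(c("PUFA", "PUFA", "MUFA", "MUFA", "MUFA", "MUFA"))

# Column means within each group: one named vector per level of Arm.
col_means <- by(hold, Arm, colMeans)

# Two versions of a grand mean over all columns within each group.
grand1 <- by(hold, Arm, function(x) mean(colMeans(x)))  # mean of column means
grand2 <- by(hold, Arm, function(x) mean(unlist(x)))    # mean of all values

col_means[["PUFA"]]
grand1[["PUFA"]]
grand2[["PUFA"]]
```

With complete data the two grand means agree, because every column has the same number of rows. They can differ once missing values are dropped with na.rm=TRUE and the columns end up with different counts.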

If you want column means, you may find aggregate() more suited to the job,
as it applies FUN to each column in each row subset of the data and returns
a data.frame instead of a list of outputs of FUN.
  > aggregate(mtcars[,3:5], mtcars[,2,drop=FALSE], mean)
    cyl     disp        hp     drat
  1   4 105.1364  82.63636 4.070909
  2   6 183.3143 122.28571 3.585714
  3   8 353.1000 209.21429 3.229286
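
The same idea applied to the poster's variables might look like the sketch below. It uses only the first six rows of the posted data, and the combined data frame `dat` and the result names are assumptions made for illustration:

```r
# First six rows of the posted data, with Arm kept as a column.
dat <- data.frame(
  Arm   = factor(c("PUFA", "PUFA", "MUFA", "MUFA", "MUFA", "MUFA")),
  Wtscr = c(120.2, 104.6, 97.2, 103.9, 58.2, 130.9),
  Wt0   = c(117.2, 105.3, 96.2, 106.1, 56.7, 127.4),
  Wt6   = c(114.6, 104.8, 93.8, 101.7, 55.5, 127.6)
)

# Data-frame interface: columns to summarise, then a named list of groups.
agg <- aggregate(dat[, c("Wtscr", "Wt0", "Wt6")],
                 by = list(Arm = dat$Arm), FUN = mean)

# Formula interface: the same summary written as response ~ group.
agg_f <- aggregate(cbind(Wtscr, Wt0, Wt6) ~ Arm, data = dat, FUN = mean)

agg
agg_f
```

Both calls return one row per level of Arm, which is usually easier to pass on to further steps than the list-like object returned by by().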



Bill Dunlap
TIBCO Software
wdunlap tibco.com
