[R] Using nrow with summaryBy

2010-03-17 Thread Tony Laidig
Hello Everyone-
I'm calculating summary statistics on a dataset (~4000 records, 
observations are not uniformly distributed) using summaryBy and trying 
to add a column with the number of observations to the output as well. 
What occurs to me is to use nrow(), but this doesn't appear to be working.

I'm able to replicate the same results with an example from the 
summaryBy docs:

library(doBy)
data(dietox)
dietox12 <- subset(dietox, Time == 12)
#this one works
summaryBy(Weight+Feed~Evit+Cu,data=dietox12,FUN=c(mean,var,length))
#adding nrow doesn't give the number of rows
summaryBy(Weight+Feed~Evit+Cu,data=dietox12,FUN=c(mean,var,length,nrow))


There must be a way to do this, but I can't figure it out. I suspect 
there is another function that would be compatible with summaryBy.

Thanks in advance.
-Tony





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using nrow with summaryBy

2010-03-17 Thread David Winsemius


On Mar 17, 2010, at 11:23 AM, Tony Laidig wrote:


> #adding nrow doesn't give the number of rows
> summaryBy(Weight+Feed~Evit+Cu,data=dietox12,FUN=c(mean,var,length,nrow))




I'm a bit puzzled. One of my many newbie mistakes was to assume that  
length() applied to dataframes would tell me how many rows it had. It  
appears that the authors of summaryBy have figured out how to get  
length() to tell you the number of observations, presumably on a  
subsetted vector where length would make sense.  So ...  it's not  
clear why you also want nrow (which would not make sense for a  
subsetted vector).





David Winsemius, MD
West Hartford, CT



Re: [R] Using nrow with summaryBy

2010-03-17 Thread Ivan Calandra

Hi David,

I have two probably stupid questions about what you said, but they might
be important to understand:

- Why would nrow() not make sense for a subsetted vector?
The help page of nrow() says it can be applied to a vector, array or
data frame (basically everything...?). So what's the difference between
a normal vector (for which it would make sense and work) and a
subsetted vector?

- Why is it a mistake to assume that length() applied to a data frame
gives the number of rows? In this case, length() is calculated for each
numerical variable (which are vectors, aren't they?).

I think these questions concern the way R handles data, which is why
it might be important for me to understand these issues.


Thanks for your input.
Regards,
Ivan




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php



Re: [R] Using nrow with summaryBy

2010-03-17 Thread David Winsemius


On Mar 17, 2010, at 12:10 PM, Ivan Calandra wrote:


> Hi David,
>
> I have two probably stupid questions about what you said, but they
> might be important to understand:
>
> - Why would nrow() not make sense for a subsetted vector?
> The help page of nrow() says it can be applied to a vector, array or
> data frame (basically everything...?). So what's the difference
> between a normal vector (for which it would make sense and work) and
> a subsetted vector?


nrow(c(0,1,3,4))
# NULL
nrow(1:12)
# NULL

(It did not throw an error, but the help page does say the value could
be NULL.)

length(1:12)
# [1] 12



> - Why is it a mistake to assume that length() applied to a data frame
> gives the number of rows? In this case, length() is calculated for
> each numerical variable (which are vectors, aren't they?).


length applied to any list is the number of elements at the first  
level. Dataframes are lists of vectors so length applied to  
data.frames gives you the number of columns, not the length of an  
individual vector in the dataframe.
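
A minimal sketch of the distinction described above, using a small made-up data frame (the name `df` and its columns are illustrative assumptions, not data from this thread). It only uses base R:

```r
# Toy data frame; the name and columns are illustrative assumptions.
df <- data.frame(x = c(1.5, 2.5, 3.5, 4.5),
                 g = c("a", "a", "b", "b"))

length(df)     # 2: a data frame is a list of columns
nrow(df)       # 4: number of rows
length(df$x)   # 4: length of a single column vector

nrow(df$x)     # NULL: a plain vector has no dim attribute
NROW(df$x)     # 4: NROW() treats a vector as a one-column matrix
```

So length() counts observations only when it is handed a single column, which is presumably what happens inside a grouped summary.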




> I think these questions concern the way R handles data, which is why
> it might be important for me to understand these issues.


It's important, for sure.





David Winsemius, MD
West Hartford, CT



Re: [R] Using nrow with summaryBy

2010-03-17 Thread Gabor Grothendieck
Use NROW rather than nrow.
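
A short sketch of why this substitution helps, assuming (as the earlier replies suggest) that each group's values reach the summary functions as a plain vector. The toy data frame `d` and the names `by_group`, `counts` and `nulls` are made up for illustration. The last block repeats the call from the original post with NROW swapped in, and only runs if doBy is installed:

```r
# Made-up data; all names here are illustrative only.
d <- data.frame(g = c("a", "a", "a", "b", "b"),
                y = c(1, 2, 3, 10, 20))

# Split y by group: each piece is a plain numeric vector.
by_group <- split(d$y, d$g)

# NROW() counts the elements of each vector.
counts <- sapply(by_group, NROW)
counts

# nrow() returns NULL for each vector and raises no error.
nulls <- lapply(by_group, nrow)

# The call from the original post, with NROW in place of nrow.
if (requireNamespace("doBy", quietly = TRUE)) {
  data("dietox", package = "doBy")
  dietox12 <- subset(dietox, Time == 12)
  print(doBy::summaryBy(Weight + Feed ~ Evit + Cu, data = dietox12,
                        FUN = c(mean, var, NROW)))
}
```

For a plain vector, NROW() and length() give the same count, so the NROW column should simply match the length column that already worked.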
