[R] Interaction between aggregate() and length()

2008-08-28 Thread Seeliger . Curt
Folks,

I've been running into an odd situation that occurs when I use the length() 
function with aggregate(), but not with either one separately.  Together, 
the result looks correct but is given an unexpected name: 'if 
(stringsAsFactors) factor(x) else x' instead of just 'x'.

# Numbers work ok
tt <- data.frame(idx=c(1,1,1,1,1,1,2,2,2,2,2,2)
,n=c(1,3,5,7,5,5,2,4,8,16,4,4)
,t=c(1,3,5,7,5,5,2,4,8,16,4,4)
,stringsAsFactors=FALSE)

aggregate(tt$t, list('idx'=tt$idx), length)
aggregate(as.factor(tt$t), list('idx'=tt$idx), length)

# Character data doesn't work right unless I convert the data to factors.
tt <- data.frame(idx=c(1,1,1,1,1,1,2,2,2,2,2,2)
,n=c('1','3','5','7','5','5','2','4','8','16','4','4')
,t=c('1','3','5','7','5','5','2','4','8','16','4','4')
,stringsAsFactors=FALSE)

aggregate(tt$t, list('idx'=tt$idx), length)
aggregate(as.factor(tt$t), list('idx'=tt$idx), length)


Any idea what is going on here?  For the record, this also happens with 
the modalvalue() function defined at 
http://wiki.r-project.org/rwiki/doku.php?id=tips:stats-basic:modalvalue 
(which also relies on length() ).

As a side note, this began as an attempt to determine sample size, for 
which I've defined a function count <- function(x) { length(na.omit(x)) }. 
No doubt there's a built-in function to do just that, but as a newbie 
I've yet to find it.

Thank you for your help,
cur
-- 
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
[EMAIL PROTECTED]
541/754-4638


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interaction between aggregate() and length()

2008-08-28 Thread Henrique Dallazuanna
One option is to use this:

aggregate(list(t=tt$t), list(idx=tt$idx), length)
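
A self-contained sketch of the suggestion above, using the character data from the original message (the n column is left out for brevity). Wrapping the vector in a named list gives aggregate() an explicit column name, so the result column does not depend on how the argument happens to be deparsed. The variable name res is used here only for illustration.

```r
# Character data from the original message.
tt <- data.frame(idx = c(1,1,1,1,1,1,2,2,2,2,2,2),
                 t = c('1','3','5','7','5','5','2','4','8','16','4','4'),
                 stringsAsFactors = FALSE)

# Naming the list element supplies the name of the output column.
res <- aggregate(list(t = tt$t), list(idx = tt$idx), length)

print(names(res))
print(res)
```

Each idx group holds six values, so the t column of the result should contain 6 for both groups.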

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] Interaction between aggregate() and length()

2008-08-28 Thread Seeliger . Curt
That's a great workaround, as I can eliminate renaming the results column 
from 'x' to whatever.  Thanks for the quick tip, Henrique.

On the other hand, I'm still stumped as to why aggregate() would name an 
output column as 'if (stringsAsFactors) factor(x) else x'.  That sort of 
behaviour seems to contradict the principle of least astonishment.
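
One plausible explanation, offered as an assumption rather than something confirmed in this thread: a column name like this can arise when a function derives the name from its argument with deparse(substitute(x)) and the caller passes a whole expression instead of a plain variable. The odd name suggests that the character method used to convert the data internally did something of this kind, which would also explain why only character input is affected and why length() itself is not the cause. The sketch below reproduces the mechanism with two hypothetical stand-in functions (label_of and wrap_chr); it is not the actual base R source.

```r
# Hypothetical stand-in: names its result after the expression it was called with.
label_of <- function(x) deparse(substitute(x))

# Hypothetical wrapper: passes an entire if/else expression down, so
# label_of() sees that expression rather than a simple variable name.
wrap_chr <- function(x, stringsAsFactors = FALSE) {
  label_of(if (stringsAsFactors) factor(x) else x)
}

plain <- label_of(t)                   # called with a bare symbol
odd   <- wrap_chr(c("1", "3", "5"))    # called through the wrapper

print(plain)
print(odd)
```

Supplying an explicit name, as in the list(t=tt$t) workaround, sidesteps the deparsed expression entirely.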

Enjoy your days,
cur

-- 
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
[EMAIL PROTECTED]
541/754-4638