I have a data frame with two columns, a factor and a numeric. I want to create
a data frame with the factor, its frequency, and the median of the numeric
column:
> head(motifList)
events score
1 aeijm -0.25000000
2 begjm -0.25000000
3 afgjm -0.25000000
4 afhjm -0.25000000
5 aeijm -0.25000000
6 aehjm 0.08333333
To get the frequency table of events:
> motifTable <- as.data.frame(table(motifList$events))
> head(motifTable)
Var1 Freq
1 aeijm 110
2 begjm 46
3 afgjm 337
4 afhjm 102
5 aehjm 190
6 adijm 18
>
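One detail worth noting at this step: as.data.frame(table(x)) names its columns
Var1 and Freq, so a later merge(..., by="events") has no shared column to join
on unless the names are set. A minimal sketch, using a small made-up factor in
place of motifList$events (the toy values are an assumption for illustration,
not the real motif data):

```r
# Toy stand-in for motifList$events (made-up values, for illustration only)
events <- factor(c("aeijm", "begjm", "aeijm", "afgjm", "aeijm", "begjm"))

# Naming the table dimension sets the name of the first column;
# responseName sets the name of the count column
motifTable <- as.data.frame(table(events = events), responseName = "freq")

print(names(motifTable))
print(motifTable)
```

With the columns named this way, the frequency table can be joined to other
tables on "events" directly.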
Now get the score column back in.
> motifTable2 <- merge(motifList, motifTable, by="events")
> head(motifTable2)
events percent freq
1 adgjm 0.00000000 111
2 adgjm NA 111
3 adgjm 0.13333333 111
4 adgjm 0.06666667 111
5 adgjm -0.16666667 111
6 adgjm NA 111
>
Lastly, aggregate on the events column to get the median of the score:
> motifTable3 <- aggregate.data.frame(motifTable2, by=list(motifTable2$events),
+ FUN=median, na.rm=TRUE)
Error in median.default(X[[1L]], ...) : need numeric data
This gives the error because events is a factor. Can someone point me to a
more direct approach?
dhs
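
One possible approach, sketched on made-up data rather than the real motifList:
aggregate applies FUN to every column it is handed, so passing only the numeric
column keeps median away from the factor. The counts and the medians can then
be computed separately and joined, which also avoids merging the counts back
onto every raw row. The toy values and the names freqTable, medTable, and
result are assumptions for illustration only.

```r
# Toy stand-in for motifList (made-up values, including an NA score)
motifList <- data.frame(
  events = factor(c("aeijm", "begjm", "afgjm", "aeijm", "begjm", "aeijm")),
  score  = c(-0.25, 0.10, 0.00, 0.25, NA, 0.05)
)

# Frequency of each event, with explicit column names
freqTable <- as.data.frame(table(events = motifList$events),
                           responseName = "freq")

# Median score per event; only the numeric column is handed to median()
medTable <- aggregate(motifList["score"],
                      by = list(events = motifList$events),
                      FUN = median, na.rm = TRUE)

# One row per event: events, freq, and the median score
result <- merge(freqTable, medTable, by = "events")
print(result)
```

Here the score column of result holds the per-event median; it can be renamed
afterwards if a clearer label is wanted.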