Try this using the built-in data frame iris:

> length(subset(iris, Sepal.Length >= 7, Sepal.Width)[[1]])
[1] 13
> length(subset(iris, Sepal.Length >= 7 & Species == 'virginica', Sepal.Width)[[1]])
[1] 12
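The same two counts can also be sketched in base R by summing a logical condition, since TRUE counts as 1 when summed. This is only an illustration of an alternative to subset(); the names n_long and n_long_virginica are made up for the sketch.

```r
# Count rows meeting a condition by summing a logical vector (TRUE counts as 1).
# n_long and n_long_virginica are hypothetical names chosen for this sketch.
n_long <- sum(iris$Sepal.Length >= 7)
n_long_virginica <- sum(iris$Sepal.Length >= 7 & iris$Species == "virginica")

print(n_long)
print(n_long_virginica)
```

The two printed values should agree with the subset() results, 13 and 12.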
> # or the following (note that dot in Sepal.Length is automatically
> # converted to _ because dot has special meaning in sql)
> library(sqldf)
> sqldf("select count(*) from iris where Sepal_Length >= 7")
  count(*)
1       13
> sqldf("select count(*) from iris where Sepal_Length >= 7 and Species = 'virginica'")
  count(*)
1       12

For the second part, use cut to create a factor with the levels you want:

iris$Sepal.Length.factor <- cut(iris$Sepal.Length, 4:8)

and then summarize as desired using sql such as:

> sqldf("select Sepal_Length_factor, avg(Sepal_Length), count(Sepal_Length) from iris group by Sepal_Length_factor")
  Sepal_Length_factor avg(Sepal_Length) count(Sepal_Length)
1               (4,5]          4.787500                  32
2               (5,6]          5.550877                  57
3               (6,7]          6.473469                  49
4               (7,8]          7.475000                  12

or use summaryBy in the doBy package.

See ?cut, ?subset, and in doBy see ?summaryBy
Also see http://sqldf.googlecode.com

On Tue, Aug 4, 2009 at 11:40 PM, Noah Silverman<n...@smartmediacorp.com> wrote:
> I've completed an experiment and want to summarize the results.
>
> There are two things I like to create.
>
> 1) A simple count of things from the data.frame with predictions
>    1a) Number of predictions with probability greater than x
>    1b) Number of predictions with probability greater than x that are
>        really true
>
> In SQL, this would be,
> "Select count(predictions) from data.frame where probability > x"
> "Select count(predictions) from data.frame where probability > x and label ='T' "
>
> How can I do this one in R?
>
>
> 2) I'd like to create what we call "binning". It is a simple list of
> probability ranges and how accurate our model is. The idea is to see how
> "true" our probabilities are.
> for example
>
> range      number of items   mean(probability)   true_accuracy
> 100-90%    20                .924                .90
> 90-80%     50                .825                .84
> 80-70%     214               .75                 .71
> etc...
>
> It would be really great if I could also graph this!
>
> Is there any kind of package or way to do this in R
>
> Thanks!
>
> -N
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.