[EMAIL PROTECTED] wrote [that he has a data set with 9 variables (columns) measured on 2000 individuals (rows) and wants a sample] in which the sum of the volume of the individuals in that sample >= 100 cubic m.
Let's suppose that this information is held in d, a data frame, and that the volume column is d$vol.

If sum(d$vol) < 100, there is no sample which satisfies your condition.
If sum(d$vol) >= 100, then d is such a sample as it stands.

If you want the smallest number of rows, then

    indices <- order(d$vol, decreasing=TRUE)

gives you the row indices sorted by decreasing volume, and

    d$vol[indices]                       => the volumes in decreasing order
    cumsum(d$vol[indices])               => their cumulative sum
    sum(cumsum(d$vol[indices]) < 100.0)  => 1 less than the number of rows you want

so

    indices <- order(d$vol, decreasing=TRUE)
    d[indices[1:(sum(cumsum(d$vol[indices]) < 100.0) + 1)], ]

should be the answer you want. Note the comma before the closing bracket: d[i, ] selects rows of a data frame, whereas d[i] would select columns. This is O(n log n) where n is the number of rows; in your case n is 2000.

If you don't need the smallest sample, but just any old haphazard answer,

    indices <- sample(nrow(d))
    d[indices[1:(sum(cumsum(d$vol[indices]) < 100.0) + 1)], ]

should be useful. Both versions assume you have already checked that sum(d$vol) >= 100.

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
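For anyone who wants to try the recipe on something small first, here is a minimal sketch that wraps both variants in one function. The function name pick_sample, its arguments, and the toy data frame are made up for illustration and are not part of the original question. It assumes the volumes are non-negative with no NAs, so that the cumulative sum never decreases.

```r
# Hypothetical helper (name and arguments are illustrative only).
# Assumes d[[col]] holds non-negative volumes with no NAs.
pick_sample <- function(d, target = 100, col = "vol", smallest = TRUE) {
  v <- d[[col]]
  if (sum(v) < target) return(NULL)        # no sample can reach the target

  indices <- if (smallest) {
    order(v, decreasing = TRUE)            # largest volumes first
  } else {
    sample(nrow(d))                        # random row order
  }

  # Rows whose running total is still below the target, plus one more
  k <- sum(cumsum(v[indices]) < target) + 1
  d[indices[seq_len(k)], , drop = FALSE]   # trailing comma: select rows
}

# Made-up toy data standing in for the real 2000-row data frame
d <- data.frame(id = 1:6, vol = c(10, 60, 5, 30, 20, 40))

s <- pick_sample(d)                        # smallest sample
r <- pick_sample(d, smallest = FALSE)      # any haphazard sample
```

On this toy data the smallest sample is the two rows with volumes 60 and 40, which total exactly 100. The haphazard version returns a different set of rows from run to run, but its total volume always reaches the target. Returning NULL when the total is too small is just one possible choice; stopping with an error would do equally well.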