On Wed, 2006-10-11 at 14:25 -0400, Brian Frappier wrote:
> I tried all of the approaches below.
>
> the problem with:
>
> > x <- data.frame(matrix(NA,100,3))
> > for (i in 2:ncol(DF)) x[,i-1] <- sample(rep(DF[,1], DF[,i]),100)
> > if you want result in data frame
> > or
> > x<-vector("list", 3)
> > for (i in 2:ncol(DF)) x[[i-1]] <- sample(rep(DF[,1], DF[,i]),100)
>
> is that this code still samples the rows, not the elements, i.e. returns 100
> or 300 in the matrix cells instead of "red" or a matrix of counts by color
> (object type) like:
>
>        x1   x2   x3
> red    32    5   60
> gr     68   95   40
> sum   100  100  100
>
> It looks like Tony is right: sampling without replacement requires listing
> of all elements to be sampled.
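The enumeration described above can be sketched in base R without extra packages. This is only a minimal sketch: it assumes the colors are stored as row names (as in the example data in this thread), and the helper name rarefy_column is made up for illustration. It expands each column into one entry per item, draws 100 without replacement, and counts the draws per color with tabulate(), which keeps colors that were never drawn as zeros.

```r
# Counts by color (rows) and sample (columns), as in the thread's example.
counts <- matrix(c(400,  300, 2500,
                   100,    0,  200,
                   300, 1000,  500),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("red", "green", "black"),
                                 c("sample1", "sample2", "sample3")))

# Hypothetical helper (not from any package): rarefy one column of counts
# down to n items by simple random sampling without replacement.
rarefy_column <- function(col_counts, n) {
  # One entry per individual item, labelled by its row index.
  items <- rep(seq_along(col_counts), times = col_counts)
  drawn <- sample(items, size = n, replace = FALSE)
  # Count draws per row index; indices never drawn are reported as 0.
  tabulate(drawn, nbins = length(col_counts))
}

set.seed(1)
rarefied <- apply(counts, 2, rarefy_column, n = 100)
rownames(rarefied) <- rownames(counts)
rarefied
colSums(rarefied)
```

Each column of rarefied sums to 100, and no cell can exceed the original count, because every item is drawn at most once. Repeating the apply() call gives a fresh rarefaction each time.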
<snip>

How about the following approach, which generates a new sample using the
rMultinom function from Hmisc?

library(Hmisc)
# rows = colors, columns = samples
data <- matrix(c(400, 300, 2500, 100, 25, 200, 300, 1000, 500),
               nrow=3, byrow=TRUE)
col.sums <- apply(data,2,sum)
# one row of proportions per sample
probs <- t(data)/col.sums
# 100 draws for each row of probs
w <- rMultinom(probs,100)
apply(w, 1, table)

Note that I replaced the zero in your example data set with 25 because
the table function doesn't seem to output the results nicely when there
are zero values.

HTH,

Manuel

> On 10/11/06, Tony Plate <[EMAIL PROTECTED]> wrote:
> >
> > Here's a way using apply(), and the prob= argument of sample():
> >
> > > df <- data.frame(sample1=c(red=400,green=100,black=300),
> > sample2=c(300,0,1000), sample3=c(2500,200,500))
> > > df
> >       sample1 sample2 sample3
> > red       400     300    2500
> > green     100       0     200
> > black     300    1000     500
> > > set.seed(1)
> > > apply(df, 2, function(counts) sample(seq(along=counts), rep=T,
> > size=7, prob=counts))
> >      sample1 sample2 sample3
> > [1,]       1       3       1
> > [2,]       1       3       1
> > [3,]       3       3       1
> > [4,]       2       3       2
> > [5,]       1       3       1
> > [6,]       2       3       1
> > [7,]       2       3       3
> > >
> >
> > Note that this does sampling WITH replacement.
> > AFAIK, sampling without replacement requires enumerating the entire
> > population to be sampled from. I.e., you cannot do
> > > sample(1:3, prob=1:3, rep=F, size=4)
> > instead of
> > > sample(c(1,2,2,3,3,3), rep=F, size=4)
> >
> > -- Tony Plate
> >
> > From reading ?sample, I was a little unclear on whether sampling
> > without replacement could work
> >
> > Petr Pikal wrote:
> > > Hi
> > >
> > > a little bit different story. But
> > >
> > > x1 <- sample(c(rep("red",400),rep("green", 100),
> > > rep("black",300)),100)
> > >
> > > is maybe close.
> > > With data frame (if it is not big)
> > >
> > >>DF
> > >
> > >   color sample1 sample2 sample3
> > > 1   red     400     300    2500
> > > 2 green     100       0     200
> > > 3 black     300    1000     500
> > >
> > > x <- data.frame(matrix(NA,100,3))
> > > for (i in 2:ncol(DF)) x[,i-1] <- sample(rep(DF[,1], DF[,i]),100)
> > > if you want result in data frame
> > > or
> > > x<-vector("list", 3)
> > > for (i in 2:ncol(DF)) x[[i-1]] <- sample(rep(DF[,1], DF[,i]),100)
> > >
> > > if you want it in list. Maybe somebody is clever enough to discard
> > > for loop but you said you have 80 columns which shall be no problem.
> > >
> > > HTH
> > > Petr
> > >
> > > On 11 Oct 2006 at 10:11, Brian Frappier wrote:
> > >
> > > Date sent: Wed, 11 Oct 2006 10:11:33 -0400
> > > From: "Brian Frappier" <[EMAIL PROTECTED]>
> > > To: "Petr Pikal" <[EMAIL PROTECTED]>
> > > Subject: Fwd: [R] rarefy a matrix of counts
> > >
> > >>---------- Forwarded message ----------
> > >>From: Brian Frappier <[EMAIL PROTECTED]>
> > >>Date: Oct 11, 2006 10:10 AM
> > >>Subject: Re: [R] rarefy a matrix of counts
> > >>To: r-help@stat.math.ethz.ch
> > >>
> > >>Hi Petr,
> > >>
> > >>Thanks for your response. I have data that looks like the following:
> > >>
> > >>              sample 1  sample 2  sample 3 ....
> > >>red candy          400       300      2500
> > >>green candy        100         0       200
> > >>black candy        300      1000       500
> > >>
> > >>I don't want to randomly select either the samples (columns) or the
> > >>"candy" types (rows), which sample as you state would allow me.
> > >>Instead, I want to randomly sample 100 candies from each sample and
> > >>retain info on their associated type. I could make a list of all the
> > >>candies in each sample:
> > >>
> > >>sample 1
> > >>red
> > >>red
> > >>red
> > >>red
> > >>green
> > >>green
> > >>black
> > >>red
> > >>black
> > >>...
> > >>
> > >>and then randomly sample those rows. Repeat for each sample.
> > >>But, I am not sure how to do that without a lot of loops, and am
> > >>wondering if there is an easier way in R. Thanks! I should have laid
> > >>this out in the first email...sorry.
> > >>
> > >>
> > >>On 10/11/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
> > >>
> > >>>Hi
> > >>>
> > >>>I am not experienced in Matlab and from your explanation I do not
> > >>>understand what exactly you want. It seems that you want to randomly
> > >>>choose a sample of 100 rows from your matrix, which can be achieved
> > >>>with sample.
> > >>>
> > >>>DF<-data.frame(rnorm(100), 1:100, 101:200, 201:300)
> > >>>DF[sample(1:100, 10),]
> > >>>
> > >>>If you want to do this several times, you need to save your result
> > >>>and then it depends on what you want to do next. One suitable form
> > >>>is a list of matrices, the other is an array, and you can use a for
> > >>>loop for completing it.
> > >>>
> > >>>HTH
> > >>>Petr
> > >>>
> > >>>
> > >>>On 10 Oct 2006 at 17:40, Brian Frappier wrote:
> > >>>
> > >>>Date sent: Tue, 10 Oct 2006 17:40:47 -0400
> > >>>From: "Brian Frappier" <[EMAIL PROTECTED]>
> > >>>To: r-help@stat.math.ethz.ch
> > >>>Subject: [R] rarefy a matrix of counts
> > >>>
> > >>>
> > >>>>Hi all,
> > >>>>
> > >>>>I have a matrix of counts for objects (rows) by samples (columns).
> > >>>>I aimed for about 500 counts in each sample (I have about 80
> > >>>>samples) and would now like to rarefy these down to 100 counts in
> > >>>>each sample using simple random sampling without replacement. I
> > >>>>plan on rarefying several times for each sample. I could do the
> > >>>>tedious looping task of making a list of all objects (with its
> > >>>>associated identifier) in each sample and then use the wonderful
> > >>>>"sampling" package to select a sub-sample of 100 for each sample
> > >>>>and thereby get a logical vector of inclusions. I would then
> > >>>>regroup the resulting logical vector into a vector of counts by
> > >>>>object, rinse and repeat several times for each sample.
> > >>>>
> > >>>>Alternately, using the same list, I could create a random index of
> > >>>>integers between 1 and the number of objects for a sample (without
> > >>>>repeats) and then select those objects from the list. Again,
> > >>>>rinse and repeat several times for each sample.
> > >>>>
> > >>>>Is there a way to directly rarefy a matrix of counts without
> > >>>>having to create a list of objects first? I am trying to switch
> > >>>>to R from Matlab and am trying to pick up good programming habits
> > >>>>from the start.
> > >>>>
> > >>>>Much appreciation!
> > >>>>
> > >>>> [[alternative HTML version deleted]]
> > >>>>
> > >>>>______________________________________________
> > >>>>R-help@stat.math.ethz.ch mailing list
> > >>>>https://stat.ethz.ch/mailman/listinfo/r-help
> > >>>>PLEASE do read the posting guide
> > >>>>http://www.R-project.org/posting-guide.html and provide commented,
> > >>>>minimal, self-contained, reproducible code.
> > >>>
> > >>>Petr Pikal
> > >>>[EMAIL PROTECTED]
> > >
> > > Petr Pikal
> > > [EMAIL PROTECTED]

-- 
Manuel A. Morales
http://mutualism.williams.edu
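On the zero-count issue with table() mentioned in Manuel's reply: one possible workaround, sketched here with base R only, is to convert the draws to a factor whose levels list every color before tabulating, so a color that is never drawn still appears with a count of 0. The sketch uses sample() with prob=, as in Tony's message, so it samples WITH replacement; the vector name counts and its values are taken from the "sample2" column of the example data and are only illustrative.

```r
# Counts for one sample, including a color with zero items.
counts <- c(red = 300, green = 0, black = 1000)

set.seed(1)
# 100 draws with replacement, with probability proportional to the counts.
draws <- sample(names(counts), size = 100, replace = TRUE, prob = counts)

# Fixing the factor levels makes table() report unseen colors as 0.
tab <- table(factor(draws, levels = names(counts)))
tab
```

Because every table built this way has the same length and order, the results for several samples can be bound together into a matrix of counts by color.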