Re: [R] New Sampling question
This can of course be done, but before I make any attempt to do it I have to ask: why do you want this?

On Wed, Nov 17, 2010 at 7:08 PM, wangwallace talentt...@gmail.com wrote:

I have another question about drawing samples from a data frame. This might sound really tricky. Let me use a data frame I posted earlier as an example:

SubID  CSE1  CSE2  CSE3  CSE4  WSE1  WSE2  WSE3  WSE4
    1     6     5     6     2     6     2     2     4
    2     6     4     7     2     6     6     2     3
    3     5     5     5     5     5     5     4     5
    4     5     4     3     4     4     4     5     2
    5     5     6     7     5     6     4     4     1
    6     5     4     3     6     4     3     7     3
    7     3     6     6     3     6     5     2     1
    8     3     6     6     3     6     5     4     7

This data frame has two sets of variables. Each set represents one scale. As shown above, the first scale, say CSE, consists of four items: CSE1, CSE2, CSE3, and CSE4, whereas the second scale, say WSE, also has four items: WSE1, WSE2, WSE3, and WSE4. The leftmost column lists the subjects' IDs.

I want to create a new data frame by sampling numbers at random from the data frame above. Below is the structure of the new data frame:

SubID  var  var  var  var
    s    c    c    c    c
    s    c    c    c    c
    s    c    w    w    w
    s    c    w    w    w
    s    c    w    w    w
    s    c    w    w    w
    s    c    w    w    w
    s    c    w    w    w

In the new data frame:

s = SubID, ranging from 1 to 8
var = variables
c = CSE numbers
w = WSE numbers

Some rules for constructing the new data frame:

1. The top two rows have to be filled with CSE numbers, and the numbers in the cells of each row should be in random order. For example, if the first row holds the numbers from subject 4, they can follow the order 4 (CSE2), 5 (CSE1), 3 (CSE3), and 4 (CSE4). The numbers in the second row do not have to follow the item order of the first row. For example, if the first row holds the numbers from subject 4 in the order CSE2, CSE1, CSE3, CSE4, the numbers in the second row (assuming it is from subject 8) do not have to be 6 (CSE2), 3 (CSE1), 6 (CSE3), and 3 (CSE4). The numbers in these two rows should be drawn without replacement.

2. Each of the remaining rows should contain one CSE number in the leftmost cell and three WSE numbers to its right. In each row, the three WSE numbers must be those that do not correspond to the CSE item in the leftmost cell. For example, if the CSE number in the leftmost cell is 4, the CSE2 number from subject 6, the three WSE numbers on the right can only be 4 (WSE1), 7 (WSE3), and 3 (WSE4) from subject 6.

3. The numbers in each row can only be drawn from the same subject. The subjects should also be in random order. Specifically, they do not have to be in the order

SubID 1 2 3 4 5 6 7 8

They can be, for example:

SubID 2 8 5 4 1 6 7 3

Any ideas? Thanks in advance!! :)

--
View this message in context: http://r.789695.n4.nabble.com/New-Sampling-question-tp3047885p3047885.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
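The three rules in the quoted question can be sketched in R roughly as follows. This is only one possible reading of the rules, not a solution posted in the thread: the function name draw_sample, the argument n_cse_rows, and the output column names var1 to var4 are made up for illustration, and the sketch assumes that the three WSE numbers in rule 2 are kept in item order (the question does not say whether they should be shuffled as well).

```r
# Example data, typed in from the table in the question.
dat <- data.frame(
  SubID = 1:8,
  CSE1 = c(6, 6, 5, 5, 5, 5, 3, 3),
  CSE2 = c(5, 4, 5, 4, 6, 4, 6, 6),
  CSE3 = c(6, 7, 5, 3, 7, 3, 6, 6),
  CSE4 = c(2, 2, 5, 4, 5, 6, 3, 3),
  WSE1 = c(6, 6, 5, 4, 6, 4, 6, 6),
  WSE2 = c(2, 6, 5, 4, 4, 3, 5, 5),
  WSE3 = c(2, 2, 4, 5, 4, 7, 2, 4),
  WSE4 = c(4, 3, 5, 2, 1, 3, 1, 7)
)

# Hypothetical helper: builds one new data frame following rules 1 to 3.
draw_sample <- function(dat, n_cse_rows = 2) {
  cse <- as.matrix(dat[, paste0("CSE", 1:4)])
  wse <- as.matrix(dat[, paste0("WSE", 1:4)])
  n <- nrow(dat)

  ord <- sample(n)  # rule 3: subjects in random order, each used once
  out <- matrix(NA_real_, nrow = n, ncol = 4)
  colnames(out) <- paste0("var", 1:4)

  for (k in seq_len(n)) {
    i <- ord[k]  # every number in row k comes from subject i
    if (k <= n_cse_rows) {
      # rule 1: all four CSE items of the subject, in random order
      out[k, ] <- cse[i, sample(4)]
    } else {
      # rule 2: one random CSE item, then the three WSE items
      # whose item number differs from it (assumed: kept in item order)
      j <- sample(4, 1)
      out[k, ] <- c(cse[i, j], wse[i, -j])
    }
  }
  data.frame(SubID = dat$SubID[ord], out)
}

set.seed(2010)  # arbitrary seed, only so that a run can be repeated
one_sample <- draw_sample(dat)
one_sample
```

Calling draw_sample(dat) again gives an independent draw, so the same function could be called repeatedly if many such data frames are needed.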
Re: [R] New Sampling question
Dear Ista Zahn-2,

If you can give me some advice, I would really appreciate it. I have been working on this for days. It seems hard for an R novice like me to write flexible functions myself.

This is for my dissertation. CSE and WSE are two scales of the same construct. The sampling strategy I described above allows me to check how the items of these two scales vary within persons and across persons.

Also, I have added another rule: draw 1000 random samples...

Again, thanks!

Wallace
Re: [R] New Sampling question
Hi Wallace,

Have you tried playing with sample()? Note that you can apply this function both to whole data frames and to specific items within a vector. If you play with applying the function to different ways of indexing your sample data, you will likely arrive at your solution.

For example:

    a <- data.frame(c(1:10), c(21:30))
    sample(a[, 2], 2)   # randomly draw two numbers from a column
    sample(a[2, ], 2)   # randomly draw two numbers from a row
    a[sample(nrow(a), 5, replace = TRUE), ]   # randomly draw five whole rows with replacement

You may also find subset() helpful, based on your description.

HTH,

Mike

On Thu, Nov 18, 2010 at 9:43 AM, wangwallace talentt...@gmail.com wrote:

Dear Ista Zahn-2, If you can give me some advice, I would really appreciate it. I have been working on this for days. It seems hard for an R novice like me to write flexible functions myself. This is for my dissertation. CSE and WSE are two scales of the same construct. The sampling strategy I described above allows me to check how the items of these two scales vary within persons and across persons. Also, I have added another rule: draw 1000 random samples... Again, thanks! Wallace

--
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA
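A small note on the sample() calls above, offered as a sketch rather than a correction: indexing a data frame by row, as in a[2, ], still gives a one-row data frame, so sample() shuffles or picks its columns and returns a data frame. If a plain vector of numbers is wanted, unlist() can be applied first. The column names x and y below are assumptions added for readability; the original example left the columns unnamed.

```r
# Toy data frame as in the message above; the names x and y are assumed.
a <- data.frame(x = 1:10, y = 21:30)

set.seed(42)  # arbitrary seed, only so that a run can be repeated

# a[, 2] is a plain vector, so this returns two numbers.
col_draw <- sample(a[, 2], 2)

# a[2, ] is a one-row data frame; sample() picks columns from it,
# so the result is again a one-row data frame.
row_df <- sample(a[2, ], 2)

# unlist() turns the row into a (named) vector before sampling.
row_vec <- sample(unlist(a[2, ]), 2)

# Five whole rows, drawn with replacement.
rows <- a[sample(nrow(a), 5, replace = TRUE), ]

is.data.frame(row_df)   # TRUE
is.numeric(row_vec)     # TRUE
```

For the CSE/WSE data, the unlist() form is probably the more convenient one, since the drawn numbers can then be placed straight into a new row.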
Re: [R] New Sampling question
Also, I need some function at the end which would enable me to draw 1000 such random samples.

Thanks! :)
Re: [R] New Sampling question
You could try writing a loop:

    a <- data.frame(c(1:10), c(21:30))
    M <- 10       # number of iterations; scale up to 1000 once you get your sampling function working
    res <- NULL   # place to store your results
    for (i in 1:M) {
      ares <- sample(a[, 2], 1)
      res <- c(res, ares)
    }
    res

It's up to you how to store your results. You can do it as a list if you want, and then you can get at each of your 1000 results. I've provided a simple example where you just add each result to the end of your vector.

Note that someone will probably take issue with the way I'm assigning results in the loop. I know I've seen it written elsewhere that this is not a very elegant way of approaching the problem (particularly in terms of efficiency), but it will work. I just can't recall the other way of doing things off the top of my head. If someone else would like to chime in, be my guest.

Mike

On Thu, Nov 18, 2010 at 9:46 AM, wangwallace talentt...@gmail.com wrote:

Also, I need some function at the end which would enable me to draw 1000 such random samples. Thanks! :)

--
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA
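For reference, two common alternatives to growing a vector with c() inside a loop are to preallocate the result and fill it by index, or to let replicate() do the repetition. The sketch below uses the same toy data; the column names x and y and the object names res_loop, res_rep, and res_list are made up for illustration.

```r
# Toy data frame as in the message above; the names x and y are assumed.
a <- data.frame(x = 1:10, y = 21:30)
M <- 10  # number of iterations; could be raised to 1000

# Option 1: preallocate the result vector, then fill it by index.
res_loop <- numeric(M)
for (i in seq_len(M)) {
  res_loop[i] <- sample(a[, 2], 1)
}

# Option 2: replicate() evaluates the expression M times and
# collects the results into a vector.
res_rep <- replicate(M, sample(a[, 2], 1))

# When each result is a whole data frame rather than one number,
# simplify = FALSE keeps the results as a list of length M.
res_list <- replicate(M, a[sample(nrow(a), 3), ], simplify = FALSE)

length(res_list)  # M
res_list[[1]]     # first of the M sampled data frames
```

The list form is the one that matches the advice above about storing each of the 1000 results separately: element k of the list is the k-th sample.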
Re: [R] New Sampling question
Hi Mike,

Thank you very much!! :) The following two calls are really helpful; I didn't even know about them. Actually, I searched the forum for something like this, but failed. Now I am still trying to write my own functions. :)

    sample(a[, 2], 2)   # randomly draw two numbers from a column
    sample(a[2, ], 2)   # randomly draw two numbers from a row
Re: [R] New Sampling question
I spent the whole afternoon on it, but there is still no progress. I wish I could take some courses... :(