Re: [R] New Sampling question

2010-11-18 Thread Ista Zahn
This can of course be done, but before I make any attempt to do it I
have to ask: why do you want this?

On Wed, Nov 17, 2010 at 7:08 PM, wangwallace talentt...@gmail.com wrote:

 I have another question about drawing samples from a data frame. This might
 sound really tricky. Let me use a data frame I have posted earlier as an
 example:

    SubID    CSE1 CSE2 CSE3 CSE4 WSE1 WSE2 WSE3 WSE4
      1          6      5       6       2      6      2        2       4
      2          6      4       7       2      6      6        2       3
      3          5      5       5       5      5      5        4       5
      4          5      4       3       4      4      4        5       2
      5          5      6       7       5      6      4        4       1
      6          5      4       3       6      4      3        7       3
      7          3      6       6       3      6      5        2       1
      8          3      6       6       3      6      5        4       7

 This data frame has two sets of variables, and each set represents one
 scale. As shown above, the first scale, CSE, consists of four items:
 CSE1, CSE2, CSE3, and CSE4. The second scale, WSE, also has four
 items: WSE1, WSE2, WSE3, and WSE4.
 The leftmost column lists the subjects' IDs.

 I want to create a new data frame by randomly sampling numbers from the
 data frame above. Below is the structure of the new data frame.

    SubID    var    var   var     var
      s          c      c      c       c
      s          c      c      c       c
      s          c      w     w       w
      s          c      w     w       w
      s          c      w     w       w
      s          c      w     w       w
      s          c      w     w       w
      s          c      w     w       w

 in the new data frame:

 s= SubID range from 1 to 8
 var= variables
 c=CSE numbers
 w=WSE numbers

 some rules to construct the new data frame:

 1. The top two rows have to be filled with CSE numbers, and the order of the
 numbers within each row should be randomized. For example, if the first row
 holds the numbers from subject 4, they can appear in the order 4(CSE2),
 5(CSE1), 3(CSE3), and 4(CSE4). The second row does not have to follow the
 item order of the first row: if the first row is from subject 4 in the order
 above, the second row (say, from subject 8) does not have to be 6(CSE2),
 3(CSE1), 6(CSE3), and 3(CSE4). The numbers in these two rows should be drawn
 without replacement.

 2. Each of the remaining rows should contain one CSE number in the leftmost
 cell and three WSE numbers to its right. In each row, the three WSE numbers
 must be the ones that do not correspond to the CSE item in the leftmost
 cell. For example, if the leftmost cell is 4, the CSE2 number from subject 6,
 the three WSE numbers can only be 4(WSE1), 7(WSE3), and 3(WSE4) from
 subject 6.

 3. The numbers in each row can only be drawn from the same subject. Also,
 the subjects should be randomized. Specifically, they do not have to be in
 the following order:

  SubID
      1
      2
      3
      4
      5
      6
      7
      8

 they can be:

  SubID
      2
      8
      5
      4
      1
      6
      7
      3

 Any ideas?  Thanks in advance!! :)
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/New-Sampling-question-tp3047885p3047885.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



Re: [R] New Sampling question

2010-11-18 Thread wangwallace

Dear Ista Zahn-2,

If you can give me some advice, I would really appreciate it. I have been
working on this for days. It is hard for an R novice like me to write
flexible functions myself.

This is for my dissertation. CSE and WSE are two scales of the same
construct. The sampling strategy I described above allows me to check how
the items of these two scales vary within and across persons.

Also, I added another rule: draw 1000 random samples...

Again, Thanks!

Wallace
-- 
View this message in context: 
http://r.789695.n4.nabble.com/New-Sampling-question-tp3047885p3048948.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] New Sampling question

2010-11-18 Thread Mike Rennie
Hi Wallace,

Have you tried playing with sample()? Note that you can apply this function
to a whole data frame as well as to specific elements of a vector. If you
experiment with applying it to different ways of indexing your sample
data, you will likely arrive at your solution.

for example:

a <- data.frame(c(1:10),c(21:30))

sample(a[,2],2) #randomly draw two numbers from a column

sample(a[2,],2) #randomly draw two numbers from a row

a[sample(nrow(a),5,replace=TRUE),] #randomly draw five whole rows with replacement

You may also find subset() helpful, based on your description.
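
Two indexing details may help when adapting these calls. This is only a small sketch: the column names x and y and the variable names are made up for illustration, and the items vector reuses subject 6's WSE values from the original post. Sampling from a one-row data frame shuffles its columns and returns a data frame, so unlist() is one way to get a plain vector first. Negative indexing drops the item at a given position.

```r
a <- data.frame(x = 1:10, y = 21:30)

# sample() on a one-row data frame shuffles the columns; the result is
# still a one-row data frame.
row_df <- sample(a[2, ], 2)

# unlist() first to get an ordinary (named) vector, then shuffle its values.
row_vec <- sample(unlist(a[2, ]))

# Negative indexing: keep every item except the one at position k.
items <- c(WSE1 = 4, WSE2 = 3, WSE3 = 7, WSE4 = 3)
k <- 2
others <- items[-k]   # WSE1, WSE3 and WSE4 remain
```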

HTH,

Mike

-- 
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA



Re: [R] New Sampling question

2010-11-18 Thread wangwallace

Also, I need some function at the end which would enable me to draw 1000 such
random samples. thanks! :)
-- 
View this message in context: 
http://r.789695.n4.nabble.com/New-Sampling-question-tp3047885p3048958.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] New Sampling question

2010-11-18 Thread Mike Rennie
You could try writing a loop:


a <- data.frame(c(1:10), c(21:30))

# number of iterations; scale up to 1000 once your sampling function works
M <- 10
res <- NULL # place to store your results
for (i in 1:M)
 {
 ares <- sample(a[,2], 1) # draw one number from the second column
 res <- c(res, ares)      # append it to the results
 }
res

It's up to you how to store your results- you can do it as a list if you
want, then you can get at each of your 1000 results. I've provided a simple
example where you just add each result to the end of your vector.

Note that someone will probably take issue with the way I'm growing the
result vector inside the loop. I know I've seen it written elsewhere that
this is not a very elegant way of approaching the problem (particularly in
terms of efficiency), but it will work. I just can't recall the other way of
doing things off the top of my head. If someone else would like to chime in,
be my guest.
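
For reference, two commonly used alternatives to growing a vector inside a loop are preallocating the result and using replicate(). The following is a minimal sketch on the same toy data; the column names x and y and the names res and res2 are only illustrative.

```r
a <- data.frame(x = 1:10, y = 21:30)
M <- 1000  # number of iterations

# Option 1: preallocate the result vector, then fill it by index.
res <- numeric(M)
for (i in seq_len(M)) {
  res[i] <- sample(a[, 2], 1)
}

# Option 2: replicate() evaluates the expression M times and collects
# the results into a vector.
res2 <- replicate(M, sample(a[, 2], 1))
```

Both avoid copying the whole result on every iteration, which is what c(res, ares) does.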

Mike




-- 
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA



Re: [R] New Sampling question

2010-11-18 Thread wangwallace

Hi, Mike,

thank you very much!!:)

The following two calls are really helpful; I didn't even know about them.
I searched the forum for something like this but found nothing. I am
still trying to write my own functions. :)

sample(a[,2],2) #randomly draw two numbers from a column

sample(a[2,],2) #randomly draw two numbers from a row 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/New-Sampling-question-tp3047885p3049083.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] New Sampling question

2010-11-18 Thread wangwallace


I spent the whole afternoon on it, but there is still no progress. I wish I
could take some courses... :(
-- 
View this message in context: 
http://r.789695.n4.nabble.com/New-Sampling-question-tp3047885p3049796.html
Sent from the R help mailing list archive at Nabble.com.



[R] New Sampling question

2010-11-17 Thread wangwallace

I have another question about drawing samples from a data frame. This might
sound really tricky. Let me use a data frame I have posted earlier as an
example:

SubID  CSE1  CSE2  CSE3  CSE4  WSE1  WSE2  WSE3  WSE4
  1      6     5     6     2     6     2     2     4
  2      6     4     7     2     6     6     2     3
  3      5     5     5     5     5     5     4     5
  4      5     4     3     4     4     4     5     2
  5      5     6     7     5     6     4     4     1
  6      5     4     3     6     4     3     7     3
  7      3     6     6     3     6     5     2     1
  8      3     6     6     3     6     5     4     7

This data frame has two sets of variables, and each set represents one
scale. As shown above, the first scale, CSE, consists of four items:
CSE1, CSE2, CSE3, and CSE4. The second scale, WSE, also has four
items: WSE1, WSE2, WSE3, and WSE4.
The leftmost column lists the subjects' IDs.

I want to create a new data frame by randomly sampling numbers from the
data frame above. Below is the structure of the new data frame.

SubID  var  var  var  var
  s     c    c    c    c
  s     c    c    c    c
  s     c    w    w    w
  s     c    w    w    w
  s     c    w    w    w
  s     c    w    w    w
  s     c    w    w    w
  s     c    w    w    w

in the new data frame:
 
s= SubID range from 1 to 8
var= variables
c=CSE numbers
w=WSE numbers

some rules to construct the new data frame:

1. The top two rows have to be filled with CSE numbers, and the order of the
numbers within each row should be randomized. For example, if the first row
holds the numbers from subject 4, they can appear in the order 4(CSE2),
5(CSE1), 3(CSE3), and 4(CSE4). The second row does not have to follow the
item order of the first row: if the first row is from subject 4 in the order
above, the second row (say, from subject 8) does not have to be 6(CSE2),
3(CSE1), 6(CSE3), and 3(CSE4). The numbers in these two rows should be drawn
without replacement.

2. Each of the remaining rows should contain one CSE number in the leftmost
cell and three WSE numbers to its right. In each row, the three WSE numbers
must be the ones that do not correspond to the CSE item in the leftmost
cell. For example, if the leftmost cell is 4, the CSE2 number from subject 6,
the three WSE numbers can only be 4(WSE1), 7(WSE3), and 3(WSE4) from
subject 6.

3. The numbers in each row can only be drawn from the same subject. Also,
the subjects should be randomized. Specifically, they do not have to be in
the following order:

 SubID
  1 
  2  
  3
  4  
  5  
  6  
  7  
  8
  
they can be:

 SubID
  2 
  8  
  5
  4  
  1  
  6  
  7  
  3

Any ideas?  Thanks in advance!! :)
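
One possible reading of the three rules can be sketched as follows. This is an illustration, not a definitive answer. The function name draw_sample, its n_cse_rows argument, and the output column names var1 to var4 are made up here. It also assumes that the subjects for the two CSE-only rows are simply the first two in the shuffled subject order, and that the three WSE numbers keep their item order.

```r
# Example data from the post: 8 subjects, 4 CSE items, 4 WSE items.
dat <- data.frame(
  SubID = 1:8,
  CSE1 = c(6, 6, 5, 5, 5, 5, 3, 3),
  CSE2 = c(5, 4, 5, 4, 6, 4, 6, 6),
  CSE3 = c(6, 7, 5, 3, 7, 3, 6, 6),
  CSE4 = c(2, 2, 5, 4, 5, 6, 3, 3),
  WSE1 = c(6, 6, 5, 4, 6, 4, 6, 6),
  WSE2 = c(2, 6, 5, 4, 4, 3, 5, 5),
  WSE3 = c(2, 2, 4, 5, 4, 7, 2, 4),
  WSE4 = c(4, 3, 5, 2, 1, 3, 1, 7)
)

# Hypothetical helper: builds one new data frame following rules 1 to 3.
draw_sample <- function(dat, n_cse_rows = 2) {
  cse <- as.matrix(dat[, grep("^CSE", names(dat))])
  wse <- as.matrix(dat[, grep("^WSE", names(dat))])
  n_items <- ncol(cse)
  ord <- sample(nrow(dat))               # rule 3: random subject order
  out <- matrix(NA_real_, nrow = nrow(dat), ncol = n_items,
                dimnames = list(NULL, paste0("var", seq_len(n_items))))
  for (r in seq_along(ord)) {
    s <- ord[r]                          # row of this subject in dat
    if (r <= n_cse_rows) {
      # rule 1: all CSE items of the subject, in shuffled order
      out[r, ] <- cse[s, sample(n_items)]
    } else {
      # rule 2: one random CSE item, then the WSE items at the other positions
      k <- sample(n_items, 1)
      out[r, ] <- c(cse[s, k], wse[s, -k])
    }
  }
  data.frame(SubID = dat$SubID[ord], out)
}

# Draw 1000 such samples and keep them in a list.
samples <- replicate(1000, draw_sample(dat), simplify = FALSE)
```

Each element of samples is one 8-row data frame, so samples[[1]] shows the first draw. Calling set.seed() beforehand makes the draws reproducible.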