Dear Jan,
Thanks for your reply.
The first solution works well for my needs for now, but I have a
question about the second. If I run your code and then call the
function:
generate_unit(10)
I get the following error:
Error in unit$size : $ operator is invalid for atomic vectors
Did you experience the same thing?
In any case, I will definitely take a look at the plyr package,
which I'm sure will be useful in the future.
Thanks again!
Emma
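A note on the error above: it is consistent with generate_unit() expecting a one-row data frame with id and size columns (which is what ddply passes to it), rather than a bare number. The sketch below copies the function from the quoted message and calls it both ways. The names err and one_unit are only illustrative.

```r
# generate_unit() as given in the quoted message; plyr is not needed to call it directly
generate_unit <- function(unit) {
  pid <- 1:unit$size
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1
  data.frame(unit_id = unit$id, pid = pid, senior = senior)
}

# 10 is an atomic vector, so unit$size fails inside the function;
# tryCatch captures the error message instead of stopping the script
err <- tryCatch(generate_unit(10), error = function(e) conditionMessage(e))
print(err)

# A one-row data frame with id and size columns is what the function expects
one_unit <- generate_unit(data.frame(id = 1, size = 10))
print(one_unit)
```

Called this way, the function returns the persons for a single unit; ddply simply does the same for every row of units and stacks the results.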
----- Original Message -----
From: Jan van der Laan <rh...@eoos.dds.nl>
To: "r-help@r-project.org" <r-help@r-project.org>
Cc: Emma Thomas <thomas...@yahoo.com>
Sent: Wednesday, December 14, 2011 6:18 AM
Subject: Re: [R] Generating input population for microsimulation
Emma,
If, as you say, each unit is the same you can just repeat the units
to obtain the required number of units. For example,
unit_size <- 10
n_units <- 10
unit_id <- rep(1:n_units, each=unit_size)
pid <- rep(1:unit_size, n_units)
senior <- ifelse(pid <= 2, 1, 0)
pop <- data.frame(unit_id, pid, senior)
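A possible variation on the code above, as a sketch only: it marks the last two persons of each unit as senior (matching the layout in the original question) and adds an identifier that is unique across the whole population. The names n_senior and person_id are assumptions introduced for this illustration.

```r
unit_size <- 10
n_units <- 10
n_senior <- 2  # assumption: two seniors per unit, as in the original question

unit_id <- rep(seq_len(n_units), each = unit_size)
pid <- rep(seq_len(unit_size), times = n_units)

# Mark the last n_senior persons of each unit as senior (pid 9 and 10 here)
senior <- ifelse(pid > unit_size - n_senior, 1, 0)

# person_id is a hypothetical extra column: unique across the whole population
person_id <- seq_along(pid)

pop <- data.frame(unit_id, person_id, pid, senior)
head(pop, 10)
```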
If you want more flexibility in generating the units, I would first
generate the units (without the persons) and then generate the
persons for each unit. In the example below I use the plyr package;
you could probably also use lapply/sapply, or simply a loop over the
units.
library(plyr)
generate_unit <- function(unit) {
  pid <- 1:unit$size
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1
  return(data.frame(unit_id=unit$id, pid=pid, senior=senior))
}
units <- data.frame(id=1:n_units, size=unit_size)
ddply(units, .(id), generate_unit)
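For anyone who prefers to avoid the plyr dependency, the lapply route mentioned above might look roughly like the following base R sketch. It assumes the same generate_unit() function and units data frame; unit_rows is a name introduced only for this example.

```r
generate_unit <- function(unit) {
  pid <- 1:unit$size
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1
  data.frame(unit_id = unit$id, pid = pid, senior = senior)
}

n_units <- 10
unit_size <- 10
units <- data.frame(id = seq_len(n_units), size = unit_size)

# Split units into a list of one-row data frames, one per unit id
unit_rows <- split(units, units$id)

# Generate the persons for each unit, then stack the results into one data frame
pop <- do.call(rbind, lapply(unit_rows, generate_unit))
rownames(pop) <- NULL
head(pop, 10)
```

The result has the same unit_id, pid and senior columns as the ddply version, without the extra id grouping column that ddply adds.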
HTH,
Jan
Emma Thomas <thomas...@yahoo.com> wrote:
Hi all,
I've been struggling with some code and was wondering if you all could help.
I am trying to generate a theoretical population of P people who
are housed within X different units. Each unit follows the same
structure: 10 people per unit, 8 of whom are junior and two of whom
are senior. I'd like to create a unit ID and a unique identifier
for each person (person ID, PID) in the population so that I have a
matrix that looks like:
      unit_id pid senior
 [1,]       1   1      0
 [2,]       1   2      0
 [3,]       1   3      0
 [4,]       1   4      0
 [5,]       1   5      0
 [6,]       1   6      0
 [7,]       1   7      0
 [8,]       1   8      0
 [9,]       1   9      1
[10,]       1  10      1
...
I came up with the following code, but am having some trouble
getting it to populate my matrix the way I'd like.
world <- function(units, pop_size, unit_size){
  pid <- rep(0,pop_size) #person ID
  senior <- rep(0,pop_size) #senior in charge
  unit_id <- rep(0,pop_size) #unit ID
  for (i in 1:pop_size){
    for (f in 1:units) {
      senior[i] = sample(c(1,1,0,0,0,0,0,0,0,0), 1, replace = FALSE)
      pid[i] = sample(c(1:10), 1, replace = FALSE)
      unit_id[i] <- f
    }
  }
  data <- cbind(unit_id, pid, senior)
  return(data)
}
world(units = 10,pop_size = 100, unit_size = 10) #call the function
The output looks like:
     unit_id pid senior
[1,]      10   7      0
[2,]      10   4      0
[3,]      10  10      0
[4,]      10   9      1
[5,]      10  10      0
[6,]      10   1      1
...
but what I really want is to generate 10 different units with
two seniors per unit, and with each person in the population having
a unique identifier.
I thought a nested for loop was one way to go about creating my
data set of people and families, but obviously I'm doing something
(or many things) wrong. Any suggestions on how to fix this? I had
been focusing on creating a person and assigning them to a unit,
but perhaps I should create the units and then populate the units
with people?
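As a rough sketch of the "create the units, then populate them" idea with a nested loop: in the code above, the inner loop runs over every unit for each person and overwrites the same row each time, so every row ends up with the last unit ID. One way around this, under the assumption that every unit has the same size, is to loop over units first and keep a running row index. The function name world_by_unit and the n_senior argument are hypothetical, and here pid is made unique across the whole population.

```r
world_by_unit <- function(units, unit_size, n_senior = 2) {
  pop_size <- units * unit_size
  unit_id <- integer(pop_size)
  pid <- integer(pop_size)
  senior <- integer(pop_size)
  i <- 0  # running row index across the whole population
  for (f in seq_len(units)) {
    # Pick which positions within this unit are senior
    senior_slots <- sample(unit_size, n_senior)
    for (p in seq_len(unit_size)) {
      i <- i + 1
      unit_id[i] <- f
      pid[i] <- i  # unique across the population
      senior[i] <- as.integer(p %in% senior_slots)
    }
  }
  cbind(unit_id, pid, senior)
}

pop <- world_by_unit(units = 10, unit_size = 10)
head(pop, 10)
```

Each unit then has exactly n_senior seniors, placed at random positions within the unit.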
Thanks so much in advance.
Emma
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.