On Tue, 2006-12-05 at 14:41 -0500, Daniel Lee Rabosky wrote:
> Hi
>
> I have a question about the creation of variables within lists in R. I am
> running simulations and am interested in two parameters, ESM and ESMM (the
> similarity of these names is important for my question). I do simulations
> to generate ESMM, then plug these values into a second simulation function
> to get ESM:
>
> x <- list()
>
> for (i in 1:nsimulations)
> {
>   x$ESMM[i] <- do_simulation1()
>   x$ESM[i] <- do_simulation2(x$ESMM[i])
> }
>
> and I return everything as a dataframe, x <- as.data.frame(x)
>
> When I do this, I find that x$ESMM is overwritten by x$ESM for the first
> simulation. However, x$ESM is nonetheless correctly generated using
> x$ESMM.
>
> Thus, x$ESM[1] = x$ESMM[1], but for the other n-thousand simulations,
> ESMM is not overwritten; the error only occurs on the first instance of
> ESM.
>
> I think I know why this is occurring: I am creating a new variable in a
> list and assigning it a value, but when R can’t find the variable, it
> overwrites the next most similar variable (ESMM). But it still proceeds
> to create the new variable ESM, having overwritten x$ESMM[1]. And it
> doesn’t happen for subsequent simulations, because both variables then
> exist in the list.
>
> My questions are:
> 1) how different do variable names have to be to avoid this problem? What
> exactly is R using to decide that ESMM is the same as ESM?
>
> or
>
> 2) is there something fundamentally flawed with the manner in which I
> dynamically create variables in lists, without initializing them in some
> fashion? This approach worked fine until I noticed this issue with
> variables having similar names.
>
> Thanks very much in advance for your help.
>
> Dan Rabosky

This has to do with partial matching to index data frame columns and
list elements. It is the default behavior in R and if you search the
archives using:

RSiteSearch("partial matching")

you will note prior discussions on this.

A simple example:

> x <- list()
> x
list()

> x$ESMM[1] <- 1
> x
$ESMM
[1] 1

> x$ESM[1] <- 2
> x
$ESMM
[1] 2

$ESM
[1] 2

Both values are changed, since x$ESM does not yet exist and the
assignment partially matches x$ESMM. Then x$ESM is created.

I think that in this particular situation, you might want to try:

# Create a simple function that returns pairs of random samples from
# 'letters', which is a:z
Sim <- function()
{
  list(ESMM = letters[sample(26, 1)],
       ESM = letters[sample(26, 1)])
}

# Run it once
> Sim()
$ESMM
[1] "l"

$ESM
[1] "z"

Now use replicate() to do this 10 times. Note the default behavior is
to simplify the returned values into a matrix.

> x <- replicate(10, Sim())
> x
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
ESMM "x"  "q"  "c"  "f"  "e"  "f"  "y"  "d"  "z"  "h"
ESM  "u"  "c"  "j"  "v"  "u"  "j"  "o"  "p"  "g"  "g"

So, in your case create a function Sim() like this:

Sim <- function()
{
  ESMM <- do_simulation1()
  ESM <- do_simulation2(ESMM)
  list(ESMM = ESMM, ESM = ESM)
}

and then use replicate() as above.

See ?replicate for more information.

HTH,

Marc Schwartz
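
As a follow-up to the second question in the original post (creating list elements on the fly), here is a minimal sketch of a loop-based alternative that avoids `$` on names that do not exist yet. The bodies of do_simulation1() and do_simulation2() below are hypothetical stand-ins, since the real functions are not shown in the thread, and the object name res is likewise just for illustration. The sketch does not rely on the overwrite shown in the example above, which may depend on the R version; it only uses the fact that `$` partially matches names on reads while `[[` matches exactly by default.

```r
# Hypothetical stand-ins for the poster's simulation functions;
# the real implementations are not given in the thread.
do_simulation1 <- function() runif(1)
do_simulation2 <- function(esmm) esmm + rnorm(1)

# Reading with `$` partially matches names; `[[` matches exactly by default.
x <- list(ESMM = 1)
x$ESM        # partial match: returns the ESMM element
x[["ESM"]]   # exact match: NULL, because no element is named "ESM"

# Pre-allocate both result vectors so that neither is created mid-loop
# through a partially matched name.
nsimulations <- 1000
ESMM <- numeric(nsimulations)
ESM  <- numeric(nsimulations)

for (i in seq_len(nsimulations)) {
  ESMM[i] <- do_simulation1()
  ESM[i]  <- do_simulation2(ESMM[i])
}

# Assemble the data frame once, after the loop.
res <- data.frame(ESMM = ESMM, ESM = ESM)
```

Keeping ESMM and ESM as separate, pre-allocated vectors means the similarity of the two names no longer matters, and it also avoids growing the vectors one element at a time. The replicate() approach above gets the same effect without an explicit loop.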