Re: [R] bootstrapping respecting subject level information

David Winsemius Thu, 04 Jul 2013 16:40:37 -0700

On Jul 3, 2013, at 7:19 PM, Sol Lago wrote:

> Hi there,
> 
> This is the first time I use this forum, and I want to say from the start I 
> am not a skilled programmer. So please let me know if the question or code 
> were unclear!
> 
> I am trying to bootstrap an interaction (that is my test statistic) using the 
> package "boot". My problem is that for every resample, I would like the 
> randomization to be done within subjects, so that observations from different 
> subjects are not mixed. Here is the code to generate a dataframe similar to 
> mine:
> 
> Subject = rep(c("S1","S2","S3","S4"),4)
> Num     = rep(c("singular","plural"),8) 
> Gram    = rep(c("gram","gram","ungram","ungram"),4)
> RT      = c(657,775,678,895,887,235,645,916,930,768,890,1016,590,978,450,920)
> data    = data.frame(Subject,Num,Gram,RT) 
> 
> This is the code I used to get the empirical interaction value:
> 
> summary(lm(RT ~ Num*Gram, data=data))
> 
> As you can see, the interaction between my two factors is -348.


That depends on what you mean by "the interaction between my two factors". It 
is almost never a good idea to attempt interpretation of interaction 
coefficients, and is always preferable to check the predictions of hte model.

> I want to get a bootstrap confidence interval for this statistic, which I can 
> generate using the "boot" package:
> 
> #Function to create the statistic to be boostrapped
> boot.huber <- function(data, indices) {
> data <- data[indices, ] #select obs. in bootstrap sample
> mod <- lm(RT ~ Num*Gram, data=data)
> coefficients(mod)       #return coefficient vector
> }
> 
> #Generate bootstrap estimate
> data.boot <- boot(data, boot.huber, 1999)
> 
> #Get confidence interval
> boot.ci(data.boot, index=4, type=c("norm", "perc", "bca"),conf=0.95) #4 gets 
> the CI for the interaction
> 
> My problem is that I think the resamples should be generated without mixing 
> the individual subjects observations: that is, to generate the new resamples, 
> the observations from subject 1 (S1) should be shuffled within subject 1,

What does that mean?

> not mixing them with the observations from subjects 2, etc... I don't know 
> how "boot" is doing the resampling (I read the documentation but don't 
> understand how the function is doing it)

It's doing it by selecting randomly entire rows. It is not "shuffling within 
rows" for selected subjects.
> 
> Does anyone know how I could make sure that the resampling procedure used by 
> "boot" respects the subject level information?

It would be doing so because that is the way you set up the indexing. The 
column ordering tof the data within subjects is not permuted.

I do think you are beyond your understanding of the statstical principles that 
you are attempting to use and would be safer to consult with a statsitician.


-- 
David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] bootstrapping respecting subject level information

Reply via email to