It took me a little while to figure this out, but: the problem is that if your resampling leaves out any countries (which is very likely), your model applied to the bootstrapped data will have fewer coefficients than your original model.

I tried this:

cc <- unique(e$Country)
func <- function(data, idx) {
   coef(lm(Score~ Time + factor(Country, levels =cc),data=data[idx,]))
}

but lm() automatically drops coefficients for missing countries (I didn't think about it too hard, but I thought they might get returned as NA and that boot() might be able to handle that ...)

If you want to do this I think you'll have to find a way to do a *stratified* bootstrap, restricting the bootstrap samples so that they always contain at least one sample from each country ... (I would have expected "strata = as.numeric(e$Country)" to do this, but it doesn't work the way I thought ... it tries to compute the statistics for *each* stratum ...)



====

 Debugging attempts:

set.seed(101)
options(error=recover)
B= boot(e, func, R=1000)


Error in t.star[r, ] <- res[[r]] :
  number of items to replace is not a multiple of replacement length

Enter a frame number, or 0 to exit

1: boot(e, func, R = 1000)

<enter 1>

Selection: 1
Called from: top level
Browse[1]> print(r)
[1] 2
Browse[1]> t.star[r,]
[1] NA NA NA NA NA NA NA NA NA

i[2,]
[1] 14 15 22 22 21 14 8 2 12 22 10 15 9 7 9 13 12 23 1 20 15 7 5 10




On 2024-01-13 5:22 p.m., varin sacha via R-help wrote:
Dear Duncan,
Dear Ivan,

I really thank you a lot for your response.
So, if I correctly understand your answers the problem is coming from this line:

coef(lm(Score~ Time + factor(Country)),data=data[idx,])

This line should be:
coef(lm(Score~ Time + factor(Country),data=data[idx,]))

If yes, now I get an error message (code here below)! So, it still does not 
work.

Error in t.star[r, ] <- res[[r]] :
   number of items to replace is not a multiple of replacement length


##########################################
Score=c(345,564,467,675,432,346,476,512,567,543,234,435,654,411,356,658,432,345,432,345,
 345,456,543,501)
Country=c("Italy", "Italy", "Italy", "Turkey", "Turkey", "Turkey", "USA", "USA", "USA", "Korea", "Korea", "Korea", "Portugal", "Portugal", "Portugal", "UK", "UK", "UK", "Poland", "Poland", "Poland", "Austria", "Austria", "Austria") Time=c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3) e=data.frame(Score, Country, Time) library(boot)
func= function(data, idx) {
coef(lm(Score~ Time + factor(Country),data=data[idx,]))
}
B= boot(e, func, R=1000)
boot.ci(B, index=2, type="perc")
#############################################################








Le samedi 13 janvier 2024 à 21:56:58 UTC+1, Ivan Krylov <ikry...@disroot.org> a 
écrit :





В Sat, 13 Jan 2024 20:33:47 +0000 (UTC)

varin sacha via R-help <r-help@r-project.org> пишет:

coef(lm(Score~ Time + factor(Country)),data=data[idx,])


Wrong place for the data=... argument. You meant to give it to lm(...),
but in the end it went to coef(...). Without the data=... argument, the
formula passed to lm() picks up the global variables inherited by the
func() closure.

Unfortunately, S3 methods really do have to ignore extra arguments they
don't understand if the class is to be extended, so coef.lm isn't
allowed to complain to you about it.


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to