Re: [R] 95% confidence interval of the coefficients from a bootstrap analysis

2012-05-01 Thread Rui Barradas
Hello,


baconbeach wrote
 
 Thanks again for your swift response!!
 
 With your last line, I get 
 
 rowMeans(sapply(stor.confint, colMeans)) 
 2.5 %97.5 % 
 0.3256882 0.4604677 
 
 I need the values (2.5% and 97.5%) for each variable of my model. I don't
 think this what I am getting.
 
 This is what my script looks like now, after your help:
 
 N = length (data_Pb[,1])
 B = 1
 stor.r2 = rep(0,B)
 stor.coeffs - vector(list, B) 
 stor.confint - vector(list, B) 
 
 for (i in 1:B){
 idx = sample(1:N, replace=T)
 newdata = data_Pb[idx,]
 L_NPRI_25k - log(newdata$NPRI_25k+1)
 data_Pb.boot = lm(newdata$Log_Level ~   
 newdata$Ind_5k + newdata$MineP_50k + 
   newdata$NPRI_10k + L_NPRI_25k )
 
 stor.r2[i] = summary(data_Pb.boot)$r.squared
 stor.coeffs [[i]] - coef(data_Pb.boot) 
 stor.confint[[i]] - confint(data_Pb.boot) 
 }
 
 hist(stor.r2, xlab=R-squared,main=Distribution of R-squared - Lead
 (log))
 summary(stor.r2)
 rowMeans(sapply(stor.confint, colMeans)) 
 rowMeans(sapply(stor.coeffs, colMeans))
 
 
 Thanks 
 
 Steeve
 

Ok, my mistake. Bootstrapped values for each coefficient, obviously.
Try, after the loop,

# Transform list into matrix, before applying 'colMeans'
stor.coeffs - do.call(rbind, stor.coeffs)
colMeans(stor.coeffs)

# The same, but in two steps
Q2.5 - lapply(stor.confint, function(x) x[ , 1])
Q2.5 - do.call(rbind, Q2.5)
head(Q2.5)
Q97.5 - lapply(stor.confint, function(x) x[ , 2])
Q97.5 - do.call(rbind, Q97.5)
head(Q97.5)

colMeans(Q2.5)
colMeans(Q97.5)
# Maybe this is better
stor.ci - data.frame(Q2.5=colMeans(Q2.5), Q97.5=colMeans(Q97.5))
rm(Q2.5, Q97.5) # Final clean-up

Maybe you could reuse the name 'stor.confint', it would save some memory.

Rui Barradas


--
View this message in context: 
http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4600737.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 95% confidence interval of the coefficients from a bootstrap analysis

2012-05-01 Thread baconbeach
Hi Rui,

You are awesome!!! It worked perfectly!!

Thanks 

--
View this message in context: 
http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4601296.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 95% confidence interval of the coefficients from a bootstrap analysis

2012-04-30 Thread baconbeach
Hello,

I am doing a simple linear regression analysis that includes few variables.
I am using a bootstrap analysis to obtain the variation of my variables to
replacement.

I am trying to obtain the coefficients  95% confidence interval from the
bootstrap procedure.

Here is my script for the bootstrap:




N = length (data_Pb[,1])
B = 1


stor.r2 = rep(0,B)
stor.r2 = rep(0,B)
stor.inter = rep(0,B)
stor.Ind5 = rep(0,B)
stor.LNPRI25  = rep(0,B)
stor.NPRI10 = rep(0,B)
stor.Mine  = rep(0,B)

for (i in 1:B){
idx = sample(1:N, replace=T)
newdata = data_Pb[idx,]


L_NPRI_25k - log(newdata$NPRI_25k+1)

data_Pb.boot = lm(newdata$Log_Level ~   
  
  newdata$Ind_5k + newdata$MineP_50k + 
  newdata$NPRI_10k + L_NPRI_25k )

stor.r2[i] = summary(data_Pb.boot)$r.squared
stor.inter[i] = summary(data_Pb.boot)$coefficients[1,1]
stor.Ind5[i] = summary(data_Pb.boot)$coefficients[2,1]
stor.LNPRI25 [i] = summary(data_Pb.boot)$coefficients[3,1]
stor.NPRI10[i] = summary(data_Pb.boot)$coefficients[4,1]
stor.Mine [i] = summary(data_Pb.boot)$coefficients[5,1]
}


hist(stor.r2, xlab=R-squared,main=Distribution of R-squared - Lead
(log))
hist(stor.inter, xlab=Intercept,main=Distribution of Intercept - Lead
(log))
hist(stor.Ind5, xlab=Industrial 5 km,main=Distribution of Industrial 5 km
- Lead (log))
hist(stor.LNPRI25, xlab=NPRI 25 km (log),main=Distribution of NPRI 25 km
- Lead (log))
hist(stor.NPRI10, xlab=NPRI 10 km,main=Distribution of NPRI 10 km - Lead
(log))
hist(stor.Mine, xlab=Past Mines 50 km,main=Distribution of Past Mines 50
km - Lead (log))


summary(stor.r2)
summary(stor.inter)
summary(stor.Ind5)
summary(stor.LNPRI25)
summary(stor.NPRI10)
summary(stor.Mine)


###Valid only for the last calculated model of the bootstrap analysis ???

confint(data_Pb.boot)


Can anybody show me the best way to get the 95% confidence interval of the
coefficients?

Thank you

Steeve

--
View this message in context: 
http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 95% confidence interval of the coefficients from a bootstrap analysis

2012-04-30 Thread baconbeach
Hi Rui,

Thanks for your help!!

It works perfectly, but when I call stor.confint, I obtain the list of
confidence interval (10,000 times).  Is there an easy way to summarize the
results and getting only one 2.5% and 97.5% values for each variable?

Thanks again for your help

Steeve

--
View this message in context: 
http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4599813.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 95% confidence interval of the coefficients from a bootstrap analysis

2012-04-30 Thread Rui Barradas
Hello,


baconbeach wrote
 
 Hello,
 
 I am doing a simple linear regression analysis that includes few
 variables. I am using a bootstrap analysis to obtain the variation of my
 variables to replacement.
 
 I am trying to obtain the coefficients  95% confidence interval from the
 bootstrap procedure.
 
 Here is my script for the bootstrap:
 
 
 
 
 N = length (data_Pb[,1])
 B = 1
 
 
 stor.r2 = rep(0,B)
 stor.r2 = rep(0,B)
 stor.inter = rep(0,B)
 stor.Ind5 = rep(0,B)
 stor.LNPRI25  = rep(0,B)
 stor.NPRI10 = rep(0,B)
 stor.Mine  = rep(0,B)
 
 for (i in 1:B){
 idx = sample(1:N, replace=T)
 newdata = data_Pb[idx,]
 
 
 L_NPRI_25k - log(newdata$NPRI_25k+1)
 
 data_Pb.boot = lm(newdata$Log_Level ~   
   
   newdata$Ind_5k + newdata$MineP_50k + 
   newdata$NPRI_10k + L_NPRI_25k )
 
 stor.r2[i] = summary(data_Pb.boot)$r.squared
 stor.inter[i] = summary(data_Pb.boot)$coefficients[1,1]
 stor.Ind5[i] = summary(data_Pb.boot)$coefficients[2,1]
 stor.LNPRI25 [i] = summary(data_Pb.boot)$coefficients[3,1]
 stor.NPRI10[i] = summary(data_Pb.boot)$coefficients[4,1]
 stor.Mine [i] = summary(data_Pb.boot)$coefficients[5,1]
 }
 
 
 hist(stor.r2, xlab=R-squared,main=Distribution of R-squared - Lead
 (log))
 hist(stor.inter, xlab=Intercept,main=Distribution of Intercept - Lead
 (log))
 hist(stor.Ind5, xlab=Industrial 5 km,main=Distribution of Industrial 5
 km - Lead (log))
 hist(stor.LNPRI25, xlab=NPRI 25 km (log),main=Distribution of NPRI 25
 km - Lead (log))
 hist(stor.NPRI10, xlab=NPRI 10 km,main=Distribution of NPRI 10 km -
 Lead (log))
 hist(stor.Mine, xlab=Past Mines 50 km,main=Distribution of Past Mines
 50 km - Lead (log))
 
 
 summary(stor.r2)
 summary(stor.inter)
 summary(stor.Ind5)
 summary(stor.LNPRI25)
 summary(stor.NPRI10)
 summary(stor.Mine)
 
 
 ###Valid only for the last calculated model of the bootstrap analysis ???
 
 confint(data_Pb.boot)
 
 
 Can anybody show me the best way to get the 95% confidence interval of the
 coefficients?
 
 Thank you
 
 Steeve
 

I think you're complicating a bit, you could save r2 in a vector and the
coefficients in a list (see (*) below),
like the one that follows:

After creating your stor vars, and before the bootstarp loop, put this

stor.confint - vector(list, B)

Then, inside the loop, at it's end,

stor.confint[[i]] - confint(data_Pb.boot)

This creates a list of matrices.

(*) The same for the coefficients, create a list first then in the loop use
function coef()

stor.coeffs - vector(list, B)

stor.coeffs - coef(data_Pb.boot)


Hope this helps,

Rui Barradas


--
View this message in context: 
http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4599734.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 95% confidence interval of the coefficients from a bootstrap analysis

2012-04-30 Thread Rui Barradas
Hello,


baconbeach wrote
 
 Hi Rui,
 
 Thanks for your help!!
 
 It works perfectly, but when I call stor.confint, I obtain the list of
 confidence interval (10,000 times).  Is there an easy way to summarize the
 results and getting only one 2.5% and 97.5% values for each variable?
 
 Thanks again for your help
 
 Steeve
 

Yes, there is.

rowMeans(sapply(stor.confint, colMeans))

Rui Barradas


--
View this message in context: 
http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4599823.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 95% confidence interval of the coefficients from a bootstrap analysis

2012-04-30 Thread baconbeach
Thanks again for your swift response!!

With your last line, I get 

 rowMeans(sapply(stor.confint, colMeans)) 
2.5 %97.5 % 
0.3256882 0.4604677 

I need the values (2.5% and 97.5%) for each variable of my model. I don't
think this what I am getting.

This is what my script looks like now, after your help:

N = length (data_Pb[,1])
B = 1
stor.r2 = rep(0,B)
stor.coeffs - vector(list, B) 
stor.confint - vector(list, B) 

for (i in 1:B){
idx = sample(1:N, replace=T)
newdata = data_Pb[idx,]
L_NPRI_25k - log(newdata$NPRI_25k+1)
data_Pb.boot = lm(newdata$Log_Level ~   
newdata$Ind_5k + newdata$MineP_50k + 
  newdata$NPRI_10k + L_NPRI_25k )

stor.r2[i] = summary(data_Pb.boot)$r.squared
stor.coeffs [[i]] - coef(data_Pb.boot) 
stor.confint[[i]] - confint(data_Pb.boot) 
}

hist(stor.r2, xlab=R-squared,main=Distribution of R-squared - Lead
(log))
summary(stor.r2)
rowMeans(sapply(stor.confint, colMeans)) 
rowMeans(sapply(stor.coeffs, colMeans))


Thanks 

Steeve

--
View this message in context: 
http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p454.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.