Re: [R] Plotting confidence bands around regression line

2011-02-10 Thread dtholmes

Here is my primitive and computationally intensive solution to this problem.
Thank you to Michal (see above) for making me aware of the nice polygon
function. I am sure that there are better solutions but this one works for
me at work and for publication.



##
#Passing Bablok Regression
#Bootstrapped and directly calculated CIs
#Written by Dan Holmes, Department of Pathology and Lab Medicine
#St. Paul's Hospital, Vancouver, BC
#Not gauranteed to produce correct results
#Use at your own risk ;)
##

library(boot)

isodd - function(N){
if (N%%2==0) {
ans-FALSE
} else if (N%%2==1) {
ans-TRUE 
} else {
ans-Neither!
  }
return(ans)
}



PB.reg-function(data,alpha,indices){
#The data is selected by the set of indices that is passed
d-data[indices,]
#The data for the PB.reg function is then taken from the resampled set, d
x-d[,1]
y-d[,2]
#Then we proceed as usual
lx-length(x)
l-choose(lx,2)
S-matrix(1:l,nrow=1,ncol=l)
for (i in 1:(lx-1)) {
for (j in (i+1):lx) {
# This expression generates the natural numbers from i and j
index-(j-i)+(i-1)*(lx-i/2)
S[index]-(y[i]-y[j])/(x[i]-x[j])
}
}
S.sort-sort(S)
S.sort-subset(S.sort,S.sort!=is.na(S.sort))
N-length(S.sort)
neg-length(subset(S.sort,S.sort0))
K-floor(neg/2)
#Determine the slope b and the intercept a
if (isodd(N)) {
b-S.sort[(N+1)/2+neg/2]
} else {
b-sqrt(S.sort[N/2+K]*S.sort[N/2+K+1])
}
a-median(y-b*x)
#CI of b by Passing and Bablok's method
C.gamma-qnorm(1-alpha/2)*sqrt(lx*(lx-1)*(2*lx+5)/18)
M1-round((N-C.gamma)/2)
M2-N-M1+1
b.lower-S.sort[M1+K]
b.upper-S.sort[M2+K]
#CI of a by Passing and Bablok's method
a.lower-median(y-b.upper*x)
a.upper-median(y-b.lower*x)
return(as.vector(c(a,b,a.lower,a.upper,b.lower,b.upper)))
}



#Determine bootstrapped CIs for the intercept and slope
PB.boot-function(data,R,n.fit,alpha){
boot.results-boot(data=data, alpha=alpha, statistic=PB.reg, R=R)
a.ci-boot.ci(boot.results, type=bca,index=1) # intercept
b.ci-boot.ci(boot.results, type=bca,index=2) # slope
a.vector-boot.results$t[,1]
b.vector-boot.results$t[,2]
x.points-seq(min(data$x)-0.1*(max(data$x)-min(data$x)),max(data$x)+0.1*(max(data$x)-min(data$x)),length.out=n.fit)
y.points-array(dim=R)
reg.CI.data-matrix(data=NA,nrow=n.fit,ncol=3)
for (i in 1:n.fit){
for (j in 1:R){
#Determine the intersections of all bootstrapped regression lines with
vertical line x=a   
y.points[j]-a.vector[j]+b.vector[j]*x.points[i]
}
#Determine the central (1-alpha)/2 quantiles
lower.y-quantile(y.points,probs=alpha/2)
upper.y-quantile(y.points,probs=(1-alpha/2))
reg.CI.data[i,1:3]-c(x.points[i],lower.y,upper.y)
}
return(list(reg.CI.data=reg.CI.data,a.ci=a.ci,b.ci=b.ci))
}



#Draw the regression and the Bland Altman plot
plotaroo-function(data,xlab,ylab,R,n.fit,alpha){
reg-PB.reg(data,alpha)
boot-PB.boot(data,R,n.fit,alpha)
reg.CI.data-boot$reg.CI.data
a.ci-boot$a.ci
b.ci-boot$b.ci
x-data[,1]
y-data[,2]
par(mfrow=c(1,2))
plot(x,y,main=Passing Bablok Regression,xlab=xlab,ylab=ylab)
#Colour in the confidence band with polygons.
for (i in 1:n.fit){
if (in.fit) {

vx-c(reg.CI.data[i,1],reg.CI.data[i,1],reg.CI.data[i+1,1],reg.CI.data[i+1,1])

vy-c(reg.CI.data[i,2],reg.CI.data[i,3],reg.CI.data[i+1,3],reg.CI.data[i+1,2])
polygon(vx,vy,fillOddEven=T,col=gray,border=gray)
}
}
#Replot the points 'cause the nasty polygons erased your points
points(x,y,pch=21,bg=gray)
#Plot the regression plot
abline(reg[1],reg[2],col=blue)
#Put regression info in the top left hand corner
correl-paste(R = ,round(cor.test(x,y)$estimate,3),sep=)
slope-paste(Slope = ,round(reg[2],2),
[,round(reg[5],2),,,round(reg[6],2),],sep=)
intercept-paste(Intercept = ,round(reg[1],2),
[,round(reg[3],2),,,round(reg[4],2),],sep=)
text-c(correl,slope,intercept)
legend(topleft,text,title=Regression Data,inset = .05)
lines(reg.CI.data[,1],reg.CI.data[,2],col=black,lwd=2,lty=3)
lines(reg.CI.data[,1],reg.CI.data[,3],col=black,lwd=2,lty=3)
#Plot the line of identity
abline(0,1,lty=3)
#Plot the Bland Altman Plot
diff-(y-x)
avg-(x+y)/2
plot(avg,diff,pch=21,bg=gray,ylim=c(-max(abs(diff)),max(abs(diff))),main=Bland
Altman Plot,xlab=Average,ylab=Difference)
abline(h=mean(diff),col=blue)
abline(h=c(-2,2)*sd(diff)+mean(diff),col=black,lty=2)
abline(h=0)
par(mfrow=c(1,1))
#Spit out the results in a readable format
reg.list-list(intercept=reg[1],slope=reg[2],CI.intercept=c(reg[3:4]),CI.slope=c(reg[5:6]),CI.intercept.bootstrap=a.ci$bca[4:5],CI.slope.bootstrap=b.ci$bca[4:5])
return(reg.list)
}

#Example application to mock data
x-seq(from=1,to=30,length.out=30)
y-1.15*x-5+rnorm(30,0,0.25*x)
data-as.data.frame(cbind(x,y))

x11(width=14)
#R is the number of bootstrap resampling events
#n.fit is the number of points with which to build the regressions
confidence band

Re: [R] Plotting confidence bands around regression line

2010-08-11 Thread peter dalgaard

On Aug 10, 2010, at 8:23 PM, Michal Figurski wrote:

 Peter,
 
 Since in PB the procedure is to calculate a whole list of slopes and 
 intercepts, wouldn't it be a solution to determine the correlation and go 
 from there? How do I do it?

You can't. (Perhaps someone could, but it looks like a research level project.)

As it may have transpired, I trust the whole PB procedure about as much as 
Reagan trusted the Russians, but if someone insisted on having it implemented, 
I'd consider selecting say 5 points along the range of x, say a1,a2,...,a5, 
look at the CI's for the intercept when applied to (x-a1,y) and interpolate 
between them.

 
 --
 Michal J. Figurski, PhD
 HUP, Pathology  Laboratory Medicine
 Biomarker Research Laboratory
 3400 Spruce St. 7 Maloney
 Philadelphia, PA 19104
 tel. (215) 662-3413
 
 On 2010-08-10 13:12, Peter Dalgaard wrote:
 Michal Figurski wrote:
 
 # And the result of the Passing-Bablok regression on this data frame:
 Estimate  5%CI 95%CI
 Intercept -4.306197 -9.948438 -1.374663
 Slope  1.257584  1.052696  1.679290
 
 The original Passing  Bablok article on this method has an easy
 prescription for CIs on coefficients, so I implemented that. Now I need
 a way to calculate CI boundaries for individual points - this may be a
 basic handbook stuff - I just don't know it (I'm not a statistician).
 
 The answer is that you can't. You can't even do it with ordinary linear
 regression without knowing the correlation between slope and intercept.
 However, if you can get a CI for the intercept then you could subtract
 x0 from all the x and get a CI for the value at x0.
 
 (This brings echos from a distant past. My master's thesis was about
 some similar median-type estimators. I can't remember whether I looked
 at the Passing-Bablok paper at the time (1985!!) but my general
 recollection is that this group of methods is littered with unstated
 assumptions.)
 

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-11 Thread Michal Figurski

Peter, Frank, David and others,

Thank you all for your ideas. I understand your lack of trust in PB 
method. Setting that aside (it's beyond me anyways), please see below 
what I have finally came up with to calculate the CI boundaries. Given 
the slope and intercept with their 05%  95% CIs, and a range of x = 1:50 :


   ints= c(-1, 0, 1)  # c(int05%, int, int95%)
   slos= c(0.9, 1, 1.1)   # c(slo05%, slo, slo95%)
   CIs = data.frame(x=1:50, lo=NA, hi=NA)
   for (i in 1:50) {
   CIs$lo[i] = min(ints + slos * CIs$x[i])
   CIs$hi[i] = max(ints + slos * CIs$x[i])
   }

It looks like it works to me. Does it make sense?

Now, what about a 4-parameter 'nls' model? Do you guys think I could use 
the same approach?


Best regards,

--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

On 2010-08-10 13:12, Peter Dalgaard wrote:

Michal Figurski wrote:


# And the result of the Passing-Bablok regression on this data frame:
 Estimate  5%CI 95%CI
Intercept -4.306197 -9.948438 -1.374663
Slope  1.257584  1.052696  1.679290

The original Passing  Bablok article on this method has an easy
prescription for CIs on coefficients, so I implemented that. Now I need
a way to calculate CI boundaries for individual points - this may be a
basic handbook stuff - I just don't know it (I'm not a statistician).


The answer is that you can't. You can't even do it with ordinary linear
regression without knowing the correlation between slope and intercept.
However, if you can get a CI for the intercept then you could subtract
x0 from all the x and get a CI for the value at x0.

(This brings echos from a distant past. My master's thesis was about
some similar median-type estimators. I can't remember whether I looked
at the Passing-Bablok paper at the time (1985!!) but my general
recollection is that this group of methods is littered with unstated
assumptions.)



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-11 Thread Frank Harrell



Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University

On Wed, 11 Aug 2010, Michal Figurski wrote:


Peter, Frank, David and others,

Thank you all for your ideas. I understand your lack of trust in PB
method. Setting that aside (it's beyond me anyways), please see below
what I have finally came up with to calculate the CI boundaries. Given
the slope and intercept with their 05%  95% CIs, and a range of x = 1:50 :

   ints= c(-1, 0, 1)  # c(int05%, int, int95%)
   slos= c(0.9, 1, 1.1)   # c(slo05%, slo, slo95%)
   CIs = data.frame(x=1:50, lo=NA, hi=NA)
   for (i in 1:50) {
   CIs$lo[i] = min(ints + slos * CIs$x[i])
   CIs$hi[i] = max(ints + slos * CIs$x[i])
   }

It looks like it works to me. Does it make sense?


Doesn't look like it takes the correlation of slope and intercept into 
account but I may not understand.




Now, what about a 4-parameter 'nls' model? Do you guys think I could use
the same approach?


This problem seems to cry out for one of the many available robust 
regression methods in R.


Frank



Best regards,

--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

On 2010-08-10 13:12, Peter Dalgaard wrote:

Michal Figurski wrote:


# And the result of the Passing-Bablok regression on this data frame:
 Estimate  5%CI 95%CI
Intercept -4.306197 -9.948438 -1.374663
Slope  1.257584  1.052696  1.679290

The original Passing  Bablok article on this method has an easy
prescription for CIs on coefficients, so I implemented that. Now I need
a way to calculate CI boundaries for individual points - this may be a
basic handbook stuff - I just don't know it (I'm not a statistician).


The answer is that you can't. You can't even do it with ordinary linear
regression without knowing the correlation between slope and intercept.
However, if you can get a CI for the intercept then you could subtract
x0 from all the x and get a CI for the value at x0.

(This brings echos from a distant past. My master's thesis was about
some similar median-type estimators. I can't remember whether I looked
at the Passing-Bablok paper at the time (1985!!) but my general
recollection is that this group of methods is littered with unstated
assumptions.)





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-11 Thread S Ellison


 Frank Harrell f.harr...@vanderbilt.edu 11/08/2010 17:02:03 

 This problem seems to cry out for one of the many available robust 
 regression methods in R.

Not sure that would be much more appropriate, although it would
_appear_ to work. The PB method is a sort of nonparametric/robust
approach to an errors-in-variables problem, intended to provide an
indication of consistency of results between two different measurement
methods, often with similar error variance. So the aim is to handle the
error-in-variable problem at least consistently, to avoid the bias that
results from assuming no error in predictors. The M-estimator and
related robust regression methods in things like MASS and robustbase
don't handle errors in the predictors. Of course, with small errors in
predictors that won't matter much; rlm and the like will be pretty much
as defensible then as they ever are.

But perhaps one could construct a more formal robust equivalent of
error-in-variable regression by using a max likelihood functional
relationship model with bivariate t (choosing arbitrarily low df)
instead of bivariate gaussian errors? Unfortunately I haven't tried
that, so no help beyond the thought ...

Steve Ellison

***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-11 Thread Frank Harrell


I may be missing something but I don't see how PB handles errors in 
variables any differently than other regression methods that ignore 
this problem.


Frank

Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University

On Wed, 11 Aug 2010, S Ellison wrote:





Frank Harrell f.harr...@vanderbilt.edu 11/08/2010 17:02:03 



This problem seems to cry out for one of the many available robust
regression methods in R.


Not sure that would be much more appropriate, although it would
_appear_ to work. The PB method is a sort of nonparametric/robust
approach to an errors-in-variables problem, intended to provide an
indication of consistency of results between two different measurement
methods, often with similar error variance. So the aim is to handle the
error-in-variable problem at least consistently, to avoid the bias that
results from assuming no error in predictors. The M-estimator and
related robust regression methods in things like MASS and robustbase
don't handle errors in the predictors. Of course, with small errors in
predictors that won't matter much; rlm and the like will be pretty much
as defensible then as they ever are.

But perhaps one could construct a more formal robust equivalent of
error-in-variable regression by using a max likelihood functional
relationship model with bivariate t (choosing arbitrarily low df)
instead of bivariate gaussian errors? Unfortunately I haven't tried
that, so no help beyond the thought ...

Steve Ellison

***
This email and any attachments are confidential. Any u...{{dropped:9}}


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread David Winsemius


On Aug 10, 2010, at 10:56 AM, Michal Figurski wrote:


Dear R-helpers and graphics gurus,

I have two problems with plotting confidence bands:

1. First is relatively simple. I am using the Passing-Bablok  
procedure to obtain unbiased regression coefficients. This  
procedure yields the a  b coefficient values along with their  
confidence intervals. I then plot the raw data with the regression  
line, but I would like to add the confidence band for the line...  
and I can't figure out how to do it.


In other words, given:

  Estimate  5%CI 95%CI
Intercept -4.305562 -9.931152 -1.381792
Slope  1.257318  1.053025  1.678516

How to plot the regression line with confidence band?


Take a look at plotCI in either gplots or plotrix packages.

Harrell's rms/Hmisc packages are nicely integrated with lattice and  
encourage you to create effective displays of models that remove  
simplistic linearity assumptions.


--
David.



2. Second problem is plotting confidence bands along fitted nls  
regression line. I tried predict(nls.object, int='c') - but  
doesn't work. Later I figured in the documentation that the 'int'  
parameter is currently ignored. I guess this means it's not a  
trivial thing to do. Does anyone have a suggestion on how to obtain  
confidence predictions for such a model?


Please cc my email address when you reply. Thanks and best regards,

--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread Michal Figurski

David,

I may have stated my problem incorrectly - my problem is to *obtain the 
coordinates* for confidence boundary lines. As input data I have only 
CIs for slope and intercept.


rms/Hmisc packages are very nice, but unfortunately they do not work 
with Passing-Bablok nor 'nls' models.


--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

W dniu 2010-08-10 11:09, David Winsemius pisze:


On Aug 10, 2010, at 10:56 AM, Michal Figurski wrote:


Dear R-helpers and graphics gurus,

I have two problems with plotting confidence bands:

1. First is relatively simple. I am using the Passing-Bablok procedure
to obtain unbiased regression coefficients. This procedure yields
the a  b coefficient values along with their confidence
intervals. I then plot the raw data with the regression line, but I
would like to add the confidence band for the line... and I can't
figure out how to do it.

In other words, given:

Estimate 5%CI 95%CI
Intercept -4.305562 -9.931152 -1.381792
Slope 1.257318 1.053025 1.678516

How to plot the regression line with confidence band?


Take a look at plotCI in either gplots or plotrix packages.

Harrell's rms/Hmisc packages are nicely integrated with lattice and
encourage you to create effective displays of models that remove
simplistic linearity assumptions.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread David Winsemius


On Aug 10, 2010, at 11:23 AM, Michal Figurski wrote:


David,

I may have stated my problem incorrectly - my problem is to *obtain  
the coordinates* for confidence boundary lines. As input data I have  
only CIs for slope and intercept.


Wouldn't you also need to specify the range over which these estimates  
might be valid and to offer the means for the X values? What level of  
R knowledge are you at? You have provided no data or code. Many R  
methods offer predict methods that return CI's.


--
david


rms/Hmisc packages are very nice, but unfortunately they do not work  
with Passing-Bablok nor 'nls' models.


--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

W dniu 2010-08-10 11:09, David Winsemius pisze:


On Aug 10, 2010, at 10:56 AM, Michal Figurski wrote:


Dear R-helpers and graphics gurus,

I have two problems with plotting confidence bands:

1. First is relatively simple. I am using the Passing-Bablok  
procedure

to obtain unbiased regression coefficients. This procedure yields
the a  b coefficient values along with their confidence
intervals. I then plot the raw data with the regression line, but I
would like to add the confidence band for the line... and I can't
figure out how to do it.

In other words, given:

Estimate 5%CI 95%CI
Intercept -4.305562 -9.931152 -1.381792
Slope 1.257318 1.053025 1.678516

How to plot the regression line with confidence band?


Take a look at plotCI in either gplots or plotrix packages.

Harrell's rms/Hmisc packages are nicely integrated with lattice and
encourage you to create effective displays of models that remove
simplistic linearity assumptions.



David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread Michal Figurski

David,

I would consider myself intermediate in R, but a beginner in statistics. 
I need a formula that would allow me to calculate confidence boundaries 
of the regression line given the slope, intercept and their CIs (and 
*any* range).


Passing-Bablok regression doesn't yet exist in R - I am developing it. 
Therefore I am sure there is no predict method for it ;)


I believe I have provided sufficient data to address this problem, but 
if that would help anyone, here is more:


# data frame
 a - structure(list(x = c(0.1, 1.43, 4.21, 3.67, 3.23, 7.72, 5.99, 
9.16, 10.6, 9.84, 11.94, 12.03, 12.89, 11.26, 15.54, 15.58, 17.32, 
17.65, 19.52, 20.48, 20.44, 20.51, 22.27, 23.58, 25.83, 26.04, 26.92, 
28.44, 30.73, 28.78), y = c(1.08, 1.39, 1.84, 0.56, 7.23, 4.91, 3.35, 
7.09, 3.16, 8.98, 16.37, 7.46, 15.46, 23.2, 4.63, 11.13, 15.68, 13.92, 
26.44, 21.65, 21.01, 20.22, 22.69, 22.21, 23.6, 17.24, 45.24, 30.09, 40, 
49.6)), .Names = c(x, y), row.names = c(NA, -30L), class = data.frame)


Then I run the regression procedure (in development - now part of the 
'MethComp' package):

 print(PBreg(a))

# And the result of the Passing-Bablok regression on this data frame:
   Estimate  5%CI 95%CI
Intercept -4.306197 -9.948438 -1.374663
Slope  1.257584  1.052696  1.679290

The original Passing  Bablok article on this method has an easy 
prescription for CIs on coefficients, so I implemented that. Now I need 
a way to calculate CI boundaries for individual points - this may be a 
basic handbook stuff - I just don't know it (I'm not a statistician). I 
would appreciate if anyone could point me to a handbook or website where 
it is described.


Regarding 2 - the predict method for 'nls' class currently *ignores* the 
interval parameter - as it is stated in documentation.


Regards

--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

On 2010-08-10 11:38, David Winsemius wrote:


On Aug 10, 2010, at 11:23 AM, Michal Figurski wrote:


David,

I may have stated my problem incorrectly - my problem is to *obtain
the coordinates* for confidence boundary lines. As input data I have
only CIs for slope and intercept.


Wouldn't you also need to specify the range over which these estimates
might be valid and to offer the means for the X values? What level of R
knowledge are you at? You have provided no data or code. Many R methods
offer predict methods that return CI's.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread Frank Harrell


Please give the prescription.  The article is not available on our 
extensive online library.  I wonder if the method can compete with the 
bootstrap.


Frank

Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University

On Tue, 10 Aug 2010, Michal Figurski wrote:


David,

I would consider myself intermediate in R, but a beginner in statistics.
I need a formula that would allow me to calculate confidence boundaries
of the regression line given the slope, intercept and their CIs (and
*any* range).

Passing-Bablok regression doesn't yet exist in R - I am developing it.
Therefore I am sure there is no predict method for it ;)

I believe I have provided sufficient data to address this problem, but
if that would help anyone, here is more:

# data frame
 a - structure(list(x = c(0.1, 1.43, 4.21, 3.67, 3.23, 7.72, 5.99,
9.16, 10.6, 9.84, 11.94, 12.03, 12.89, 11.26, 15.54, 15.58, 17.32,
17.65, 19.52, 20.48, 20.44, 20.51, 22.27, 23.58, 25.83, 26.04, 26.92,
28.44, 30.73, 28.78), y = c(1.08, 1.39, 1.84, 0.56, 7.23, 4.91, 3.35,
7.09, 3.16, 8.98, 16.37, 7.46, 15.46, 23.2, 4.63, 11.13, 15.68, 13.92,
26.44, 21.65, 21.01, 20.22, 22.69, 22.21, 23.6, 17.24, 45.24, 30.09, 40,
49.6)), .Names = c(x, y), row.names = c(NA, -30L), class = data.frame)

Then I run the regression procedure (in development - now part of the
'MethComp' package):
 print(PBreg(a))

# And the result of the Passing-Bablok regression on this data frame:
   Estimate  5%CI 95%CI
Intercept -4.306197 -9.948438 -1.374663
Slope  1.257584  1.052696  1.679290

The original Passing  Bablok article on this method has an easy
prescription for CIs on coefficients, so I implemented that. Now I need
a way to calculate CI boundaries for individual points - this may be a
basic handbook stuff - I just don't know it (I'm not a statistician). I
would appreciate if anyone could point me to a handbook or website where
it is described.

Regarding 2 - the predict method for 'nls' class currently *ignores* the
interval parameter - as it is stated in documentation.

Regards

--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

On 2010-08-10 11:38, David Winsemius wrote:


On Aug 10, 2010, at 11:23 AM, Michal Figurski wrote:


David,

I may have stated my problem incorrectly - my problem is to *obtain
the coordinates* for confidence boundary lines. As input data I have
only CIs for slope and intercept.


Wouldn't you also need to specify the range over which these estimates
might be valid and to offer the means for the X values? What level of R
knowledge are you at? You have provided no data or code. Many R methods
offer predict methods that return CI's.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread David Winsemius


On Aug 10, 2010, at 12:12 PM, Michal Figurski wrote:


David,

I would consider myself intermediate in R, but a beginner in  
statistics. I need a formula that would allow me to calculate  
confidence boundaries of the regression line given the slope,  
intercept and their CIs (and *any* range).


For ordinary regression the CI's for prediction intervals are going  
to be much wider than the CI's for parameter estimates. In both cases  
they are quadratic functions that depend on the mean_x_hat and on  
s_hat^2 (used as an estimate of sigma^2). These formulae should be  
available in any basic regression text. I am sufficiently aware of my  
non-statistician status to know that I could not comment on whether  
naively applying those functions to estimates from another method  
would have validity.


--
David.


Passing-Bablok regression doesn't yet exist in R - I am developing  
it. Therefore I am sure there is no predict method for it ;)


I believe I have provided sufficient data to address this problem,  
but if that would help anyone, here is more:


# data frame
 a - structure(list(x = c(0.1, 1.43, 4.21, 3.67, 3.23, 7.72, 5.99,  
9.16, 10.6, 9.84, 11.94, 12.03, 12.89, 11.26, 15.54, 15.58, 17.32,  
17.65, 19.52, 20.48, 20.44, 20.51, 22.27, 23.58, 25.83, 26.04,  
26.92, 28.44, 30.73, 28.78), y = c(1.08, 1.39, 1.84, 0.56, 7.23,  
4.91, 3.35, 7.09, 3.16, 8.98, 16.37, 7.46, 15.46, 23.2, 4.63, 11.13,  
15.68, 13.92, 26.44, 21.65, 21.01, 20.22, 22.69, 22.21, 23.6, 17.24,  
45.24, 30.09, 40, 49.6)), .Names = c(x, y), row.names = c(NA,  
-30L), class = data.frame)


Then I run the regression procedure (in development - now part of  
the 'MethComp' package):

 print(PBreg(a))

# And the result of the Passing-Bablok regression on this data frame:
  Estimate  5%CI 95%CI
Intercept -4.306197 -9.948438 -1.374663
Slope  1.257584  1.052696  1.679290

The original Passing  Bablok article on this method has an easy  
prescription for CIs on coefficients, so I implemented that. Now I  
need a way to calculate CI boundaries for individual points - this  
may be a basic handbook stuff - I just don't know it (I'm not a  
statistician). I would appreciate if anyone could point me to a  
handbook or website where it is described.


Regarding 2 - the predict method for 'nls' class currently *ignores*  
the interval parameter - as it is stated in documentation.


Regards

--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

On 2010-08-10 11:38, David Winsemius wrote:


On Aug 10, 2010, at 11:23 AM, Michal Figurski wrote:


David,

I may have stated my problem incorrectly - my problem is to *obtain
the coordinates* for confidence boundary lines. As input data I have
only CIs for slope and intercept.


Wouldn't you also need to specify the range over which these  
estimates
might be valid and to offer the means for the X values? What level  
of R
knowledge are you at? You have provided no data or code. Many R  
methods

offer predict methods that return CI's.



David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread Peter Dalgaard
Michal Figurski wrote:

 # And the result of the Passing-Bablok regression on this data frame:
 Estimate  5%CI 95%CI
 Intercept -4.306197 -9.948438 -1.374663
 Slope  1.257584  1.052696  1.679290
 
 The original Passing  Bablok article on this method has an easy 
 prescription for CIs on coefficients, so I implemented that. Now I need 
 a way to calculate CI boundaries for individual points - this may be a 
 basic handbook stuff - I just don't know it (I'm not a statistician).

The answer is that you can't. You can't even do it with ordinary linear
regression without knowing the correlation between slope and intercept.
However, if you can get a CI for the intercept then you could subtract
x0 from all the x and get a CI for the value at x0.

(This brings echos from a distant past. My master's thesis was about
some similar median-type estimators. I can't remember whether I looked
at the Passing-Bablok paper at the time (1985!!) but my general
recollection is that this group of methods is littered with unstated
assumptions.)

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread Michal Figurski

Frank,

I had to order this article through Inter-Library Loan and wait for it 
for a week!


I'll try to make it short. In Passing-Bablok the principle is to 
calculate slopes between all possible pairs of points in the dataset, 
and then to take a shifted median of those slopes, where the offset is 
the number of slopes of value (-1). Because of this, bootstrap is out 
of question - it would take too much time.


Let n be the number of data points, N - the number of slopes and K - 
the offset. The locations of CI boundaries in the set of N slopes are 
then calculated with formulas:

M1 - N - qnorm(1 - conf.level/2) * sqrt((n*(n-1)*(2*n+5))/18))/2
M2 - N - M1 + 1

CIs for intercept are calculated as medians of y(i) - slopes[M1,M2]*x(i)

I hope I don't confuse anyone.

The article has a mathematical derivations section and justification 
for these formulas. It looks solid to me, although after 25 years the 
flaws may be apparent to some of you.


--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

On 2010-08-10 12:29, Frank Harrell wrote:


Please give the prescription. The article is not available on our
extensive online library. I wonder if the method can compete with the
bootstrap.

Frank

Frank E Harrell Jr Professor and Chairman School of Medicine
Department of Biostatistics Vanderbilt University

On Tue, 10 Aug 2010, Michal Figurski wrote:


David,

I would consider myself intermediate in R, but a beginner in statistics.
I need a formula that would allow me to calculate confidence boundaries
of the regression line given the slope, intercept and their CIs (and
*any* range).

Passing-Bablok regression doesn't yet exist in R - I am developing it.
Therefore I am sure there is no predict method for it ;)

I believe I have provided sufficient data to address this problem, but
if that would help anyone, here is more:

# data frame
 a - structure(list(x = c(0.1, 1.43, 4.21, 3.67, 3.23, 7.72, 5.99,
9.16, 10.6, 9.84, 11.94, 12.03, 12.89, 11.26, 15.54, 15.58, 17.32,
17.65, 19.52, 20.48, 20.44, 20.51, 22.27, 23.58, 25.83, 26.04, 26.92,
28.44, 30.73, 28.78), y = c(1.08, 1.39, 1.84, 0.56, 7.23, 4.91, 3.35,
7.09, 3.16, 8.98, 16.37, 7.46, 15.46, 23.2, 4.63, 11.13, 15.68, 13.92,
26.44, 21.65, 21.01, 20.22, 22.69, 22.21, 23.6, 17.24, 45.24, 30.09, 40,
49.6)), .Names = c(x, y), row.names = c(NA, -30L), class =
data.frame)

Then I run the regression procedure (in development - now part of the
'MethComp' package):
 print(PBreg(a))

# And the result of the Passing-Bablok regression on this data frame:
Estimate 5%CI 95%CI
Intercept -4.306197 -9.948438 -1.374663
Slope 1.257584 1.052696 1.679290

The original Passing  Bablok article on this method has an easy
prescription for CIs on coefficients, so I implemented that. Now I need
a way to calculate CI boundaries for individual points - this may be a
basic handbook stuff - I just don't know it (I'm not a statistician). I
would appreciate if anyone could point me to a handbook or website where
it is described.

Regarding 2 - the predict method for 'nls' class currently *ignores* the
interval parameter - as it is stated in documentation.

Regards

--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

On 2010-08-10 11:38, David Winsemius wrote:


On Aug 10, 2010, at 11:23 AM, Michal Figurski wrote:


David,

I may have stated my problem incorrectly - my problem is to *obtain
the coordinates* for confidence boundary lines. As input data I have
only CIs for slope and intercept.


Wouldn't you also need to specify the range over which these estimates
might be valid and to offer the means for the X values? What level of R
knowledge are you at? You have provided no data or code. Many R methods
offer predict methods that return CI's.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread Michal Figurski

Peter,

Since in PB the procedure is to calculate a whole list of slopes and 
intercepts, wouldn't it be a solution to determine the correlation and 
go from there? How do I do it?


--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

On 2010-08-10 13:12, Peter Dalgaard wrote:

Michal Figurski wrote:


# And the result of the Passing-Bablok regression on this data frame:
 Estimate  5%CI 95%CI
Intercept -4.306197 -9.948438 -1.374663
Slope  1.257584  1.052696  1.679290

The original Passing  Bablok article on this method has an easy
prescription for CIs on coefficients, so I implemented that. Now I need
a way to calculate CI boundaries for individual points - this may be a
basic handbook stuff - I just don't know it (I'm not a statistician).


The answer is that you can't. You can't even do it with ordinary linear
regression without knowing the correlation between slope and intercept.
However, if you can get a CI for the intercept then you could subtract
x0 from all the x and get a CI for the value at x0.

(This brings echos from a distant past. My master's thesis was about
some similar median-type estimators. I can't remember whether I looked
at the Passing-Bablok paper at the time (1985!!) but my general
recollection is that this group of methods is littered with unstated
assumptions.)



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread Frank Harrell



Thanks Michael,

That's the method that Dana Quade taught me in his intro 
nonparametrics course at UNC in the mid 1970s, at least for a single 
predictor.  His method did not incorporate the shift you mentioned 
though.  The method looks robust.  Not sure about efficiency.


Frank

Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University

On Tue, 10 Aug 2010, Michal Figurski wrote:


Frank,

I had to order this article through Inter-Library Loan and wait for it
for a week!

I'll try to make it short. In Passing-Bablok the principle is to
calculate slopes between all possible pairs of points in the dataset,
and then to take a shifted median of those slopes, where the offset is
the number of slopes of value (-1). Because of this, bootstrap is out
of question - it would take too much time.

Let n be the number of data points, N - the number of slopes and K -
the offset. The locations of CI boundaries in the set of N slopes are
then calculated with formulas:
M1 - N - qnorm(1 - conf.level/2) * sqrt((n*(n-1)*(2*n+5))/18))/2
M2 - N - M1 + 1

CIs for intercept are calculated as medians of y(i) - slopes[M1,M2]*x(i)

I hope I don't confuse anyone.

The article has a mathematical derivations section and justification
for these formulas. It looks solid to me, although after 25 years the
flaws may be apparent to some of you.

--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

On 2010-08-10 12:29, Frank Harrell wrote:


Please give the prescription. The article is not available on our
extensive online library. I wonder if the method can compete with the
bootstrap.

Frank

Frank E Harrell Jr Professor and Chairman School of Medicine
Department of Biostatistics Vanderbilt University

On Tue, 10 Aug 2010, Michal Figurski wrote:


David,

I would consider myself intermediate in R, but a beginner in statistics.
I need a formula that would allow me to calculate confidence boundaries
of the regression line given the slope, intercept and their CIs (and
*any* range).

Passing-Bablok regression doesn't yet exist in R - I am developing it.
Therefore I am sure there is no predict method for it ;)

I believe I have provided sufficient data to address this problem, but
if that would help anyone, here is more:

# data frame

a - structure(list(x = c(0.1, 1.43, 4.21, 3.67, 3.23, 7.72, 5.99,

9.16, 10.6, 9.84, 11.94, 12.03, 12.89, 11.26, 15.54, 15.58, 17.32,
17.65, 19.52, 20.48, 20.44, 20.51, 22.27, 23.58, 25.83, 26.04, 26.92,
28.44, 30.73, 28.78), y = c(1.08, 1.39, 1.84, 0.56, 7.23, 4.91, 3.35,
7.09, 3.16, 8.98, 16.37, 7.46, 15.46, 23.2, 4.63, 11.13, 15.68, 13.92,
26.44, 21.65, 21.01, 20.22, 22.69, 22.21, 23.6, 17.24, 45.24, 30.09, 40,
49.6)), .Names = c(x, y), row.names = c(NA, -30L), class =
data.frame)

Then I run the regression procedure (in development - now part of the
'MethComp' package):

print(PBreg(a))


# And the result of the Passing-Bablok regression on this data frame:
Estimate 5%CI 95%CI
Intercept -4.306197 -9.948438 -1.374663
Slope 1.257584 1.052696 1.679290

The original Passing  Bablok article on this method has an easy
prescription for CIs on coefficients, so I implemented that. Now I need
a way to calculate CI boundaries for individual points - this may be a
basic handbook stuff - I just don't know it (I'm not a statistician). I
would appreciate if anyone could point me to a handbook or website where
it is described.

Regarding 2 - the predict method for 'nls' class currently *ignores* the
interval parameter - as it is stated in documentation.

Regards

--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413

On 2010-08-10 11:38, David Winsemius wrote:


On Aug 10, 2010, at 11:23 AM, Michal Figurski wrote:


David,

I may have stated my problem incorrectly - my problem is to *obtain
the coordinates* for confidence boundary lines. As input data I have
only CIs for slope and intercept.


Wouldn't you also need to specify the range over which these estimates
might be valid and to offer the means for the X values? What level of R
knowledge are you at? You have provided no data or code. Many R methods
offer predict methods that return CI's.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.