Re: [R] Probability weights with density estimation

2008-01-17 Thread David Winsemius
Charles C. Berry [EMAIL PROTECTED] wrote in
news:[EMAIL PROTECTED]: 

 On Wed, 16 Jan 2008, David Winsemius wrote:
 
   I am a physician examining an NHANES dataset available at the
   NCHS website:
http://www.cdc.gov/nchs/about/major/nhanes/nhanes2005-2006/demo_d.xpt
<snip>

 library(MASS)   # provides kde2d()

 TC.ran <- exp(rnorm(400,1.5,.3))
 HDL.ran <- exp(rnorm(400,.4,.3) )

 f1 <- kde2d(HDL.ran,TC.ran,n=25,lims=c(0,4,2,10))

 contour(f1$x,f1$y,f1$z,ylim=c(0,8),xlim=c(0,3),ylab="TC mmol/L",
  xlab="HDL mmol/L")
 lines(f1$x,5*f1$x)   # iso-ratio lines
 lines(f1$x,4*f1$x)
 lines(f1$x,3*f1$x)

 Two questions:
 Is there a 2d density estimation function that has provision for
 probability weights (or inverse sampling probabilities)? 
<snip>

 
 It looks like you can use bkde2D from the KernSmooth package.
 
 For details, you might look at the function sqlocpoly in surveyNG,
 which uses the KernSmooth package.

The prospect of setting up an SQL database was rather daunting, so I 
continued my search. The documentation for the sql.. functions mentioned 
that they provide the functions in package locfit. I found that locfit() 
offers the weighting options I needed. This is what I came up with:

library(locfit)   # provides locfit()

tc.hdl.fit <- with(small.nh.chol,
   locfit(~LBDHDDSI+LBDTCSI,
          weights=WTMEC2YR,
          xlim=c(0,0,4,10)
   )
)
plot(tc.hdl.fit)   # gives warnings but does work
title(main="Weighted", xlab="HDL", ylab="TC")
# add labels _after_ plotting.
# never could figure out how to get plot() to accept xlab or ylab
# when passing the locfit object to it.
with(tc.hdl.fit, lines(x,x*4))
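
For comparison, a weighted version of the product-Gaussian kernel estimate can be sketched in base R. This is not what locfit() does internally and is not taken from any package; the function name wkde2d, the simulated weights wt, and the use of the unweighted bw.nrd() rule for the bandwidths are all assumptions made for illustration.

```r
# Weighted 2-D Gaussian kernel density estimate on a regular grid.
# x, y : coordinates; w : non-negative weights (e.g. sampling weights);
# n    : grid points per axis; lims : c(xlo, xhi, ylo, yhi).
# wkde2d is a hypothetical helper, not part of MASS or locfit.
wkde2d <- function(x, y, w, n = 25, lims = c(range(x), range(y))) {
  stopifnot(length(x) == length(y), length(x) == length(w), all(w >= 0))
  w  <- w / sum(w)                  # normalise the weights to sum to 1
  hx <- bw.nrd(x)                   # assumption: bandwidths ignore the weights
  hy <- bw.nrd(y)
  gx <- seq(lims[1], lims[2], length.out = n)
  gy <- seq(lims[3], lims[4], length.out = n)
  kx <- dnorm(outer(gx, x, "-") / hx) / hx   # n rows, one column per point
  ky <- dnorm(outer(gy, y, "-") / hy) / hy
  z  <- kx %*% (w * t(ky))          # z[i, j] = sum_k w[k] * kx[i, k] * ky[j, k]
  list(x = gx, y = gy, z = z)
}

set.seed(1)
TC.ran  <- exp(rnorm(400, 1.5, .3))
HDL.ran <- exp(rnorm(400, .4, .3))
wt      <- runif(400, 1, 10)        # simulated stand-in for sampling weights

f2 <- wkde2d(HDL.ran, TC.ran, wt, n = 25, lims = c(0, 4, 2, 10))
contour(f2$x, f2$y, f2$z, xlab = "HDL mmol/L", ylab = "TC mmol/L")
lines(f2$x, 4 * f2$x)               # iso-ratio line TC/HDL = 4
```

With equal weights this reduces to an ordinary unweighted product-kernel estimate, which gives a simple way to check the sketch against an unweighted fit.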

-- 
Thanks; 
and thank you, Andy Liaw, for helpful earlier posts;
David Winsemius

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Probability weights with density estimation

2008-01-16 Thread Charles C. Berry
On Wed, 16 Jan 2008, David Winsemius wrote:

   I am a physician examining an NHANES dataset available at the NCHS
 website:
 http://www.cdc.gov/nchs/about/major/nhanes/nhanes2005-2006/demo_d.xpt
 http://www.cdc.gov/nchs/about/major/nhanes/nhanes2005-2006/hdl_d.xpt
 http://www.cdc.gov/nchs/about/major/nhanes/nhanes2005-2006/tchol_d.xpt

   Thank you to the R authors, and to the foreign package authors in
 particular. Importing from the SAS export format files was a snap. The
 dataset consists of demographic data linked to laboratory measurements.
 Each subject has an associated sampling weight. I have gotten informative
 displays following the examples using kde2d() in V&R MASSe2 (more
 thanks), but these were unweighted analyses. The ratio of total
 cholesterol (TC) to HDL cholesterol is used clinically to estimate risk
 of future heart disease, and I am looking at how such ratios divide
 or intersect with the TC x HDL-C distribution. Rather than include all
 the real data, let me just post a simulation that shows a contour plot
 reasonably similar to what I am seeing.

 library(MASS)   # provides kde2d()

 TC.ran <- exp(rnorm(400,1.5,.3))
 HDL.ran <- exp(rnorm(400,.4,.3) )

 f1 <- kde2d(HDL.ran,TC.ran,n=25,lims=c(0,4,2,10))

 contour(f1$x,f1$y,f1$z,ylim=c(0,8),xlim=c(0,3),ylab="TC mmol/L",
  xlab="HDL mmol/L")
 lines(f1$x,5*f1$x)   # iso-ratio lines
 lines(f1$x,4*f1$x)
 lines(f1$x,3*f1$x)

 Two questions:
 Is there a 2d density estimation function that has provision for
 probability weights (or inverse sampling probabilities)? I seem to
 remember a discussion on the list about whether such a procedure would
 be meaningful, but my searches cannot locate that thread or any worked
 examples that incorporate sampling weights.


It looks like you can use bkde2D from the KernSmooth package.

For details, you might look at the function sqlocpoly in surveyNG, 
which uses the KernSmooth package.



 If there is such a function, would it be a simple matter to calculate
 the proportion of the total population that would be expected to have a
 ratio of y.ran/x.ran of less than a particular number, say 4.0?

Maybe my eyesight is failing, but I did not see where you define 'y.ran' 
and 'x.ran'. If they, like 'TC.ran' and 'HDL.ran', are just variables that 
are directly measured in your survey, then estimating the proportion less 
than a given value for y.ran/x.ran is standard survey sampling fare and no 
density estimation is needed. In that case, the 'survey' package at CRAN 
is what you want.
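
As a minimal sketch of that point estimate, the weighted proportion can be computed directly in base R as the sum of the weights of subjects below the cutoff divided by the total weight. The helper name wprop_below and the simulated weights wt are assumptions for illustration; the sketch ignores strata and clusters, so it gives no standard error, which is where the design-based functions in the 'survey' package (svydesign(), svymean()) would come in.

```r
set.seed(1)
TC.ran  <- exp(rnorm(400, 1.5, .3))
HDL.ran <- exp(rnorm(400, .4, .3))
wt      <- runif(400, 1, 10)   # simulated stand-in for sampling weights

# Weighted proportion of subjects whose ratio num/den falls below cutoff.
# wprop_below is a hypothetical helper, not a function from any package.
wprop_below <- function(num, den, w, cutoff) {
  sum(w * (num / den < cutoff)) / sum(w)
}

p4 <- wprop_below(TC.ran, HDL.ran, wt, 4.0)
p4   # estimated population share with TC/HDL below 4
```

With all weights equal, this is just mean(TC.ran / HDL.ran < 4).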

HTH,

Chuck


 -- 
 Respectfully;
 David Winsemius



Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]                  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



[R] Probability weights with density estimation

2008-01-15 Thread David Winsemius
   I am a physician examining an NHANES dataset available at the NCHS 
website: 
http://www.cdc.gov/nchs/about/major/nhanes/nhanes2005-2006/demo_d.xpt
http://www.cdc.gov/nchs/about/major/nhanes/nhanes2005-2006/hdl_d.xpt
http://www.cdc.gov/nchs/about/major/nhanes/nhanes2005-2006/tchol_d.xpt

   Thank you to the R authors, and to the foreign package authors in 
particular. Importing from the SAS export format files was a snap. The 
dataset consists of demographic data linked to laboratory measurements. 
Each subject has an associated sampling weight. I have gotten informative 
displays following the examples using kde2d() in V&R MASSe2 (more 
thanks), but these were unweighted analyses. The ratio of total 
cholesterol (TC) to HDL cholesterol is used clinically to estimate risk 
of future heart disease, and I am looking at how such ratios divide 
or intersect with the TC x HDL-C distribution. Rather than include all 
the real data, let me just post a simulation that shows a contour plot 
reasonably similar to what I am seeing.

library(MASS)   # provides kde2d()

TC.ran <- exp(rnorm(400,1.5,.3))
HDL.ran <- exp(rnorm(400,.4,.3) )

f1 <- kde2d(HDL.ran,TC.ran,n=25,lims=c(0,4,2,10))

contour(f1$x,f1$y,f1$z,ylim=c(0,8),xlim=c(0,3),ylab="TC mmol/L",
  xlab="HDL mmol/L")
lines(f1$x,5*f1$x)   # iso-ratio lines
lines(f1$x,4*f1$x)
lines(f1$x,3*f1$x)

Two questions:
Is there a 2d density estimation function that has provision for 
probability weights (or inverse sampling probabilities)? I seem to 
remember a discussion on the list about whether such a procedure would 
be meaningful, but my searches cannot locate that thread or any worked 
examples that incorporate sampling weights. 

If there is such a function, would it be a simple matter to calculate 
the proportion of the total population that would be expected to have a 
ratio of y.ran/x.ran of less than a particular number, say 4.0?

-- 
Respectfully;
David Winsemius
