[R] Problem with R density function

2014-05-14 Thread DHIMAN BHADRA
Hello,
My friend has the following issue with R. I will be glad to receive any
response.
Thanks,
Dhiman Bhadra

Hello everyone,

I am trying to use the 'density' function available with the base package
of R to estimate the density of a data set for subsequent use. I just
noticed that with even 1000 data points, the numerical integral of the
estimated density using the Epanechnikov kernel is far from 1. I wonder if
I am doing something wrong, or whether there is a bug:

x <- rnorm(1000)
dd <- density(x, kernel="epanechnikov", n=101, bw=0.001)
sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 5.7245

dd <- density(x, kernel="epanechnikov", n=1001, bw=0.001)
sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 2.870922

dd <- density(x, kernel="epanechnikov", n=10001, bw=0.001)
sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 0.9989762

So unless I use around 10000 or more points, the integral is wrong:
there seems to be a scaling factor creeping in. Am I missing something?


Best regards,
*Apratim Guha*

__
*Dr. Apratim Guha*
*Associate Professor, Production & Quantitative Methods Area, IIM
Ahmedabad,*

*Vastrapur, Ahmedabad 380015, INDIA. Phone: (91) 79 6632 4803*
*Secretary: Ms. Sujatha Jayprakash: (91) 79 6632 4911*

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with R density function

2014-05-14 Thread Martyn Byng
Hi,

Have you tried changing the bandwidth rather than the number of points?
The default bandwidth gives:

x <- rnorm(1000)
dd <- density(x, kernel="epanechnikov", n=101)
sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 1.001014
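
One way to see why the bandwidth matters is to compare the spacing of the output grid with the bandwidth itself. The following is only a sketch: the seed, the sample size of 1000 and the helper names `riemann_area` and `spacing_over_bw` are assumptions made for illustration, not part of the original question.

```r
set.seed(1)                       # assumed seed, for reproducibility only
x <- rnorm(1000)                  # assumed sample size

# Hypothetical helper: Riemann sum of a density estimate on its own grid
riemann_area <- function(dd) sum(dd$y) * (dd$x[2] - dd$x[1])

# Hypothetical helper: grid spacing expressed in units of the bandwidth
spacing_over_bw <- function(dd) (dd$x[2] - dd$x[1]) / dd$bw

dd_default <- density(x, kernel = "epanechnikov", n = 101)
dd_narrow  <- density(x, kernel = "epanechnikov", n = 101, bw = 0.001)

# A ratio well below 1 means the grid resolves each kernel bump;
# a ratio far above 1 means the grid steps over most of them.
spacing_over_bw(dd_default)
spacing_over_bw(dd_narrow)

riemann_area(dd_default)
riemann_area(dd_narrow)
```

With the default bandwidth the grid is fine relative to the kernel width, so the sum should land close to 1; with bw=0.001 and only 101 points it should not.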

Martyn


Re: [R] Problem with R density function

2014-05-14 Thread Andrews, Chris
Try adding plots, e.g. 

set.seed(20140514)
x <- rnorm(100)
hist(x, prob=TRUE, ylim=c(0,10))
dd <- density(x, n=10001, bw=0.001)
lines(dd, col=2, type="s")
dd <- density(x, n=101, bw=0.001)
lines(dd, col=3, type="s")

The density function you produce with bw=0.001 is very irregular (many sharp, 
narrow peaks).  You should expect to need many intervals (i.e., large n) in 
your Riemann sum to get an accurate estimate of the area under it.
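
The same point can be checked numerically by repeating the Riemann sum over a range of grid sizes. This is a sketch under assumptions: the helper name `riemann_area` and the particular values in `n_grid` are chosen here for illustration.

```r
set.seed(20140514)
x <- rnorm(100)

# Hypothetical helper: Riemann sum of the density estimate on a grid of n points
riemann_area <- function(n, bw) {
  dd <- density(x, n = n, bw = bw)
  sum(dd$y) * (dd$x[2] - dd$x[1])
}

# Illustrative grid sizes, from coarse to fine
n_grid <- c(101, 1001, 10001, 100001)
areas  <- sapply(n_grid, riemann_area, bw = 0.001)

# The area should settle near 1 once the grid spacing is small relative to bw
data.frame(n = n_grid, area = areas)
```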

Chris

