You don't seem to be making any corrections or updating your code. There remains a syntax error in the last line of cvhfunc because of mismatched parens.

On Jun 20, 2009, at 1:04 PM, muddz wrote:


Hi David,

Thanks and I apologize for the lack of clarity.

#n is defined as the length of xdat
n<-length(xdat)

Would be better to include that at the top of  the function body.

#I defined 'k' as the Gaussian kernel function
k<-function(v) {1/sqrt(2*pi)*exp(-v^2/2)} #GAUSSIAN kernal

#I believe ypred in my case, was the leave one out estimator (I think its
the variable x, in other words xdat or independent variable).

It does not make sense to me that ypred or yhat would be an estimator of x.

I could be wrong, but from the text of Davidson and MacKinnon, thats what I got out of
it.

Don't have whatever text. The Google hits suggest it might be "Econometric Theory and Methods". Amazon has one of those look inside options. Searching on cv(h) suggests you are trying to replicate kernel regression on p 686.

#I totally hear you on the fact that ypred will just equal y[i], however it should not be the case because for each x[i], I should be running all x[j] except for when x[i]=x[j]. And I should be mulitiplying the numerator of
ypred (y[i]*k( (x[j]-x[i])/h ). However, its not doing that.

Not doing what? You need to decide what is wrong.

(y[i]*k((x[j]-x[i])/h))/k((x[j]-x[i])/h) = y[i] * 1 # by definition, unless x[j] = x[i], in which case it is NaN

... which is what in the vector that your current function produces after fixing some syntax. Doing some debugging it looks as though the first time through with i=1 and j=2 that the results of the calucation is NaN which means that ypred will always be NaN. Just because it is done over a loop does not fix the error in numerical logic and failure to check arguments. So you still need to decide how to trap situations where x[j] = x[i]. And you need to re-read D&M and implement the weights that prevent using using terms when k((x[j]- x[i])/h) is very small.

(There is only a single summation index in that CV(h) formula. So maybe you could just run predict(lm(y ~ . , data=x[-i])) instead of that god-awful nested loop.)


#I believe CV should be the following: It is used to determine optimal
bandwidth. Steps:

If CV is a scalar then the last line could not possibly give you what you want:

> (1 - 1:10)   #scalar minus vector gives vector.
 [1]  0 -1 -2 -3 -4 -5 -6 -7 -8 -9

Don't you want a sum?

(1)The process involves running x, [j] times for each x[i], except for when
x[j]=x[i]. This is similiar to drawing a histogram and finding how how
large/small the bins should be. ypred should be a vector of nx1


(2) The second step is similiar to a variance measure, where take the
difference of y and (1), square, and than sum for all n.

You are going to need to define "second step".

# I was not sure with this> " ypred

That line looks useless to me at the moment. ypred is whatever it is at that point, and it won't return a value from the function because you later evaluate other expressions. Functions only return the last evaluated expression or what is wrapped in return().


# not sure what that will do inside
the function. If
it's there for debugging you may want to print(ypred)".

Thanks.
-Mudasir



                            }

David Winsemius wrote:


On Jun 19, 2009, at 7:45 PM, muddz wrote:


Hi Uwe,

My apologies.

Please if I can be guided what I am doing wrong in the code. I
started my
code as such:

# ypred is my leave one out estimator of x

Estimator of x? Really?

cvhfunc<-function(y,x,h)    {
   ypred<-0
   for (i in 1:n){   #how did you define "n"? It's not in your
parameter list
      for (j in 1:n){
      if (j!=i){
           ypred<-ypred +    (y[i]*k( (x[j]-x[i])/h ) )/
                                      k( (x[j]-x[i])/h )

At this point you are also using "k" as a function. Have you at any
point defined "k"?

Also multiplication of the ratio of two identical numbers would
generally give a result of y[i] for that second term. Unless, of
course, any x[j] = x[i] in which case you will throw an error and
every thing will grind to a halt.

It might help if you now explained what you think "CV" should be.

               }
                 }   }
   ypred     # not sure what that will do inside the function. If
it's there for debugging you may want to print(ypred)

#CVh is a

# Yes? CVh is a ....what ?

    cvh<-0
    cvh<-cvh+((1/n)*(y-ypred)^2    # n again. R will still not know
what that is.
    cvh

# ypred is a scalar, while y is a vector, so cvh will be a vector. Is
that what you want?

                            }

test2<-cvhfunc(ydat,xdat,.2);test2

#I was experimenting with the following data:
library(datasets)
data(faithful)
ydat<-faithful$eruptions;ydat;plot(ydat);par(new=T)
xdat<-faithful$waiting;xdat;plot(xdat,col="blue")

# I want to minimize the CV function with respect to h. Thanks.




Uwe Ligges-3 wrote:

See the posting guide:
If you provide commented, minimal, self-contained, reproducible code
some people may be willing to help on the list.

Best,
Uwe Ligges


muddz wrote:
Hi All,

I have been trying to get this LOO-Cross Validation method to work
on R
for
the past 3 weeks but have had no luck. I am hoping someone would
kindly
help
me.

Essentially I want to code the LOO-Cross Validation for the 'Local
Constant'
and 'Local Linear' constant estimators. I want to find optimal h,
bandwidth.

Thank you very much!
-M



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
View this message in context:
http://www.nabble.com/Leave-One-Out-Cross-Validation-tp24025738p24120380.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
View this message in context: 
http://www.nabble.com/Leave-One-Out-Cross-Validation-tp24025738p24127218.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to