Re: [R] optim bug/help?

2008-10-23 Thread rkevinburton
I was just trying to get a feel for the general strengths and weaknesses of the 
algorithms. This is obviously a demo function and probably not representative 
of the real optimization problems that I will face. My concern was that if 
there is inaccuracy when I know the answer, it might be worse when I don't, if 
I don't understand the characteristics of the algorithms involved.

Kevin

 Ben Bolker [EMAIL PROTECTED] wrote: 
  rkevinburton at charter.net writes:
 
  
  The documentation for 'optim' gives the following function:
  
  fr <- function(x) {   ## Rosenbrock Banana function
  x1 <- x[1]
  x2 <- x[2]
  100 * (x2 - x1 * x1)^2 + (1 - x1)^2
  }
  optim(c(-1.2,1), fr)
  
  When I run this code I get:
  
  $par
  [1] 1.000260 1.000506
  
  I am sure I am missing something, but why isn't 1,1 a better answer? If I
  plug 1,1 into the function I get exactly zero, whereas the answer given
  yields a function value of 8.82e-8. This was after 195 calls to the function
  (as reported by optim). The documentation indicates that 'reltol' is about
  1e-8. Is this the limit that I am bumping up against?
  
  Kevin
  
 
   Yes, this is basically just numeric fuzz, the
 bulk of which probably comes from finite-difference
 evaluation of the derivative.
 As demonstrated below, you can get a lot closer
 by defining an analytic gradient function.
 May I ask why this level of accuracy is important?
  (Insert Tukey quotation about approximate answers
  to the right question here ...)
 
   Ben Bolker
 -
 
 fr <- function(x) {   ## Rosenbrock Banana function
 x1 <- x[1]
 x2 <- x[2]
 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
 }
 
 ## gradient function
 frg <- function(x) {
   x1 <- x[1]
   x2 <- x[2]
   c(100*(4*x1^3-2*x2*2*x1)-2+2*x1,
 100*(2*x2-2*x1^2))
 }
 
 ## use numericDeriv to double-check my calculus
 x1 <- 1.5
 x2 <- 1.7
 numericDeriv(quote(fr(c(x1,x2))), c("x1","x2"))
 frg(c(x1,x2))
 
 ##
 optim(c(-1.2,1), fr)  ## Nelder-Mead
 optim(c(-1.2,1), fr, method = "BFGS")
 optim(c(-1.2,1), fr, gr = frg, method = "BFGS")  ## use gradient
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] optim bug/help?

2008-10-23 Thread Patrick Burns

[EMAIL PROTECTED] wrote:

I was just trying to get a feel for the general strengths and weaknesses of the algorithms. This is 
obviously a demo function and probably not representative of the real 
optimization problems that I will face. My concern was that if there is inaccuracy when I know the 
answer, it might be worse when I don't, if I don't understand the characteristics of the algorithms 
involved.


One approach to this problem is to do the optimization
several times with different (random?) starts.
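
A minimal sketch of such a multistart run (the ten restarts and the uniform
starting box on [-2, 2]^2 are arbitrary illustrative choices, and fr is the
Rosenbrock function quoted below):

## restart Nelder-Mead from several random starting points and keep the best fit
set.seed(1)
starts <- replicate(10, runif(2, min = -2, max = 2), simplify = FALSE)
fits <- lapply(starts, function(p) optim(p, fr))
best <- fits[[which.min(sapply(fits, function(f) f$value))]]
best$par  ## parameters from the best of the ten local searches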


Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)

Kevin

 Ben Bolker [EMAIL PROTECTED] wrote: 
  rkevinburton at charter.net writes:

The documentation for 'optim' gives the following function:

fr <- function(x) {   ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
optim(c(-1.2,1), fr)

When I run this code I get:

$par
[1] 1.000260 1.000506

I am sure I am missing something, but why isn't 1,1 a better answer? If I plug
1,1 into the function I get exactly zero, whereas the answer given yields a
function value of 8.82e-8. This was after 195 calls to the function (as
reported by optim). The documentation indicates that 'reltol' is about 1e-8.
Is this the limit that I am bumping up against?

Kevin

  Yes, this is basically just numeric fuzz, the
bulk of which probably comes from finite-difference
evaluation of the derivative.
As demonstrated below, you can get a lot closer
by defining an analytic gradient function.
May I ask why this level of accuracy is important?
(Insert Tukey quotation about approximate answers
to the right question here ...)

  Ben Bolker
-

fr <- function(x) {   ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}

## gradient function
frg <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  c(100*(4*x1^3-2*x2*2*x1)-2+2*x1,
100*(2*x2-2*x1^2))
}

## use numericDeriv to double-check my calculus
x1 <- 1.5
x2 <- 1.7
numericDeriv(quote(fr(c(x1,x2))), c("x1","x2"))
frg(c(x1,x2))

##
optim(c(-1.2,1), fr)  ## Nelder-Mead
optim(c(-1.2,1), fr, method = "BFGS")
optim(c(-1.2,1), fr, gr = frg, method = "BFGS")  ## use gradient






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] optim bug/help?

2008-10-23 Thread Ravi Varadhan
Hi Kevin,

There is no substitute for knowing your problem as well as you possibly can
(I know this can be challenging in complicated, real problems).  If you plot
the contours of the function, you will see that they are banana-shaped
around the solution, i.e. the objective function is not very sensitive to
changes along the length of the banana, but changes rapidly in the
perpendicular direction.  A concise way of saying all this is that the
problem is ill-conditioned.  A consequence of such ill-conditioning is
that the numerical solution becomes highly sensitive to perturbations in the
"data", where the term "data" includes all the parameters (related to both
the function and the numerical algorithm) that are required to solve the
problem.  Because of this ill-conditioning, the step size used to compute the
finite-difference approximation of the gradient is one such datum with a
significant impact on the solution, as Ben pointed out.  The starting value
is another crucial piece of data, as Patrick pointed out.

In the Rosenbrock example, the problem is due to the scaling factor 100 in
the Rosenbrock function.  Increase this, and the problem becomes more
severe.  
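
For instance, a hypothetical sketch (the fr_scaled helper and the factor
10000 are only illustrative, not from the original posts) that passes the
scale factor through optim's ... argument:

## Rosenbrock with an adjustable scale factor k; larger k gives a more
## ill-conditioned valley
fr_scaled <- function(x, k = 100) k * (x[2] - x[1]^2)^2 + (1 - x[1])^2
optim(c(-1.2, 1), fr_scaled, k = 100)$par    ## close to c(1, 1)
optim(c(-1.2, 1), fr_scaled, k = 10000)$par  ## typically noticeably further off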

One way to get a sense of the degree of ill-conditioning is to look at the
ratio of the largest to the smallest eigenvalue of the inverse of the Hessian
matrix:

 ans <- optim(c(-1,1), fr, grr, method = "BFGS", hessian = TRUE) 
 eigen(solve(ans$hessian))$values
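
Spelled out a little more (a sketch assuming the analytic gradient frg defined
earlier in the thread; the explicit ratio is the condition number described
above):

ans <- optim(c(-1, 1), fr, gr = frg, method = "BFGS", hessian = TRUE)
ev <- eigen(solve(ans$hessian))$values
max(ev) / min(ev)  ## a large ratio indicates an ill-conditioned problem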

Hope this is helpful,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of [EMAIL PROTECTED]
Sent: Thursday, October 23, 2008 4:03 PM
To: [EMAIL PROTECTED]; Ben Bolker
Subject: Re: [R] optim bug/help?

I was just trying to get a feel for the general strengths and weaknesses of
the algorithms. This is obviously a demo function and probably not
representative of the real optimization problems that I will face. My
concern was that if there is inaccuracy when I know the answer, it might be
worse when I don't, if I don't understand the characteristics of the
algorithms involved.

Kevin

 Ben Bolker [EMAIL PROTECTED] wrote: 
  rkevinburton at charter.net writes:
 
  
  The documentation for 'optim' gives the following function:
  
  fr <- function(x) {   ## Rosenbrock Banana function
  x1 <- x[1]
  x2 <- x[2]
  100 * (x2 - x1 * x1)^2 + (1 - x1)^2
  }
  optim(c(-1.2,1), fr)
  
  When I run this code I get:
  
  $par
  [1] 1.000260 1.000506
  
  I am sure I am missing something, but why isn't 1,1 a better answer?
  If I plug 1,1 into the function I get exactly zero, whereas the answer
  given yields a function value of 8.82e-8. This was after 195 calls to the
  function (as reported by optim). The documentation indicates that 'reltol'
  is about 1e-8. Is this the limit that I am bumping up against?
  
  Kevin
  
 
   Yes, this is basically just numeric fuzz, the bulk of which probably 
 comes from finite-difference evaluation of the derivative.
 As demonstrated below, you can get a lot closer by defining an 
 analytic gradient function.
 May I ask why this level of accuracy is important?
 (Insert Tukey quotation about approximate answers to the right 
 question here ...)
 
   Ben Bolker
 -
 
 fr <- function(x) {   ## Rosenbrock Banana function
 x1 <- x[1]
 x2 <- x[2]
 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
 }
 
 ## gradient function
 frg <- function(x) {
   x1 <- x[1]
   x2 <- x[2]
   c(100*(4*x1^3-2*x2*2*x1)-2+2*x1,
 100*(2*x2-2*x1^2))
 }
 
 ## use numericDeriv to double-check my calculus
 x1 <- 1.5
 x2 <- 1.7
 numericDeriv(quote(fr(c(x1,x2))), c("x1","x2"))
 frg(c(x1,x2))
 
 ##
 optim(c(-1.2,1), fr)  ## Nelder-Mead
 optim(c(-1.2,1), fr, method = "BFGS")
 optim(c(-1.2,1), fr, gr = frg, method = "BFGS")  ## use gradient
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] optim bug/help?

2008-10-22 Thread rkevinburton
The documentation for 'optim' gives the following function:

fr <- function(x) {   ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
optim(c(-1.2,1), fr)

When I run this code I get:

$par
[1] 1.000260 1.000506

I am sure I am missing something, but why isn't 1,1 a better answer? If I plug 
1,1 into the function I get exactly zero, whereas the answer given yields a 
function value of 8.82e-8. This was after 195 calls to the function (as 
reported by optim). The documentation indicates that 'reltol' is about 1e-8. 
Is this the limit that I am bumping up against?



Kevin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] optim bug/help?

2008-10-22 Thread Ben Bolker
 rkevinburton at charter.net writes:

 
 The documentation for 'optim' gives the following function:
 
 fr <- function(x) {   ## Rosenbrock Banana function
 x1 <- x[1]
 x2 <- x[2]
 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
 }
 optim(c(-1.2,1), fr)
 
 When I run this code I get:
 
 $par
 [1] 1.000260 1.000506
 
 I am sure I am missing something, but why isn't 1,1 a better answer? If I plug
 1,1 into the function I get exactly zero, whereas the answer given yields a
 function value of 8.82e-8. This was after 195 calls to the function (as
 reported by optim). The documentation indicates that 'reltol' is about 1e-8.
 Is this the limit that I am bumping up against?
 
 Kevin
 

  Yes, this is basically just numeric fuzz, the
bulk of which probably comes from finite-difference
evaluation of the derivative.
As demonstrated below, you can get a lot closer
by defining an analytic gradient function.
May I ask why this level of accuracy is important?
(Insert Tukey quotation about approximate answers
to the right question here ...)

  Ben Bolker
-

fr <- function(x) {   ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}

## gradient function
frg <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  c(100*(4*x1^3-2*x2*2*x1)-2+2*x1,
100*(2*x2-2*x1^2))
}

## use numericDeriv to double-check my calculus
x1 <- 1.5
x2 <- 1.7
numericDeriv(quote(fr(c(x1,x2))), c("x1","x2"))
frg(c(x1,x2))

##
optim(c(-1.2,1), fr)  ## Nelder-Mead
optim(c(-1.2,1), fr, method = "BFGS")
optim(c(-1.2,1), fr, gr = frg, method = "BFGS")  ## use gradient
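
As a side note on the original 'reltol' question: the tolerance can be
tightened through optim's control list (a sketch; the particular values are
arbitrary), although for the gradient-based methods the finite-difference
fuzz discussed above still matters unless an analytic gradient is supplied.

optim(c(-1.2,1), fr, control = list(reltol = 1e-12, maxit = 2000))  ## tighter Nelder-Mead tolerance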

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.