On Sat, Sep 24, 2011 at 3:19 PM, John Pearson <johnepear...@gmail.com> wrote:
> Then where do the constraints g = 0 and h = 0 go?
>
> On Sat, Sep 24, 2011 at 3:21 AM, Juan Pablo Carbajal
> <carba...@ifi.uzh.ch> wrote:
>>
>> Hi John,
>>
>> SQP uses the gradient of phi, a non-linear function, to compute a local
>> second-order estimate of phi (a Taylor series), and then uses qp() to
>> take the next step of the optimization.  The name "sequential quadratic
>> programming" comes from this fact.  This is clearly stated in the
>> documentation as of version >= 3.4 (if I am not wrong):
>>
>>   "The first element should point to the objective function, the second
>>   should point to a function that computes the gradient of the
>>   objective function, and the third should point to a function that
>>   computes the Hessian of the objective function.  If the gradient
>>   function is not supplied, the gradient is computed by finite
>>   differences.  If the Hessian function is not supplied, a BFGS update
>>   formula is used to approximate the Hessian."
>>
>> Check the code inside; between lines 370 and 445 is more or less where
>> the second-order estimate is calculated.
>>
>> The arguments of the first and second derivatives of a function are the
>> same as the arguments of the function itself:
>>
>>   f(x) ~ f(x0) + G(f)(x0)' * (x - x0) + 0.5 * (x - x0)' * H(f)(x0) * (x - x0)
>>
>> Regards,
>>
>> On Fri, Sep 23, 2011 at 11:36 PM, John Pearson <johnepear...@gmail.com>
>> wrote:
>> > g is an equality constraint.  h is an inequality constraint.  If G
>> > and H are the gradient and Hessian of phi, which arguments are they?
>> > The documentation makes it look as if they go in the 3rd and 4th
>> > spots, which the documentation says are inequality constraints.
>> > Here is what the documentation says:
>> >
>> >   "When supplied, the gradient function must be of the form
>> >
>> >     G = gradient (X)
>> >
>> >   in which X is a vector and G is a vector.
>> >   When supplied, the Hessian function must be of the form
>> >
>> >     H = hessian (X)
>> >
>> >   in which X is a vector and H is a matrix."
>> >
>> > It says that X is a vector and G is a vector.  If they mean the
>> > gradient of phi then it ought to say the gradient of phi, but I
>> > suspect that sqp doesn't use gradient information at all.  I am
>> > achieving success with it by passing an inequality constraint into
>> > the h slot (the 4th argument).  I suspect that the code cannot use
>> > the gradient and Hessian.
>> >
>> > On Fri, Sep 23, 2011 at 3:28 PM, Juan Pablo Carbajal
>> > <carba...@ifi.uzh.ch> wrote:
>> >>
>> >> On Fri, Sep 23, 2011 at 6:18 PM, John Pearson
>> >> <johnepear...@gmail.com> wrote:
>> >> > I'm not sure that this is the proper way to address documentation
>> >> > issues, but I think the help page for "sqp" (sequential quadratic
>> >> > programming) is inconsistent.  Specifically, there are two
>> >> > inconsistent definitions for the 3rd and 4th arguments (G and H):
>> >> >
>> >> >   -- Function File: [X, OBJ, INFO, ITER, NF, LAMBDA] = sqp (X0, PHI)
>> >> >   -- Function File: [...] = sqp (X0, PHI, G)
>> >> >   -- Function File: [...] = sqp (X0, PHI, G, H)
>> >> >   -- Function File: [...] = sqp (X0, PHI, G, H, LB, UB)
>> >> >   -- Function File: [...] = sqp (X0, PHI, G, H, LB, UB, MAXITER)
>> >> >   -- Function File: [...] = sqp (X0, PHI, G, H, LB, UB, MAXITER,
>> >> >      TOLERANCE)
>> >> >      Solve the nonlinear program
>> >> >
>> >> >           min phi (x)
>> >> >            x
>> >> >
>> >> >      subject to
>> >> >
>> >> >           g(x)  = 0
>> >> >           h(x) >= 0
>> >> >           lb <= x <= ub
>> >> >
>> >> > Then it says:
>> >> >
>> >> >   "When supplied, the gradient function must be of the form
>> >> >
>> >> >     G = gradient (X)
>> >> >
>> >> >   in which X is a vector and G is a vector.
>> >> >   When supplied, the Hessian function must be of the form
>> >> >
>> >> >     H = hessian (X)
>> >> >
>> >> >   in which X is a vector and H is a matrix.
>> >> >   The third and fourth arguments are function handles pointing to
>> >> >   functions that compute the equality constraints and the
>> >> >   inequality constraints, respectively."
>> >> >
>> >> > Calling G the gradient of X doesn't make any sense to me.  If X is
>> >> > a vector, its gradient should be a matrix.  In any event, either g
>> >> > is a constraint or it isn't.  If it is a constraint it isn't a
>> >> > gradient, and similarly H isn't a Hessian but an inequality
>> >> > constraint.  Which are they?  They can't be both!
>> >> > Best regards, and thanks for your great work in making Octave a
>> >> > great open-source software tool.
>> >> > John
>> >> > --
>> >> > John E. Pearson
>> >> >
>> >> > _______________________________________________
>> >> > Help-octave mailing list
>> >> > help-oct...@octave.org
>> >> > https://mailman.cae.wisc.edu/listinfo/help-octave
>> >>
>> >> Hi,
>> >> I see no inconsistencies.
>> >> 1. phi(X) is a scalar, but the argument X is a vector: phi: R^n -> R.
>> >> The gradient of a scalar function phi is itself a vector, the
>> >> vector G.
>> >> 2. The Hessian is the Jacobian of the gradient of phi; again, it
>> >> depends on the vector X and it returns a matrix: the Hessian H.
>> >>
>> >> If by inconsistencies you mean the fact that equality constraints
>> >> are defined by the vector function g and inequalities by h, while
>> >> the gradient is called G and the Hessian H -- well, strictly
>> >> speaking g and G are different symbols, but maybe it can be made
>> >> clearer.  I do not know.
>> >>
>> >> Did I address your problem at all?
>> >>
>> >> --
>> >> M. Sc. Juan Pablo Carbajal
>> >> -----
>> >> PhD Student
>> >> University of Zürich
>> >> http://ailab.ifi.uzh.ch/carbajal/
>> >
>> > --
>> > John E. Pearson
>>
>> --
>> M. Sc. Juan Pablo Carbajal
>> -----
>> PhD Student
>> University of Zürich
>> http://ailab.ifi.uzh.ch/carbajal/
>
> --
> John E. Pearson
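[To make the signature under discussion concrete, here is a minimal sketch of the plain four-argument call, with the equality and inequality constraints in the 3rd and 4th slots as the docstring describes. The objective and constraints are hypothetical, not taken from the thread.]

```matlab
% Minimize phi(x) = x'*x subject to x(1) + x(2) - 1 = 0  (equality, slot 3)
% and x(1) - 0.25 >= 0  (inequality, slot 4).
phi = @(x) sumsq (x);          % objective, R^2 -> R (scalar)
g   = @(x) x(1) + x(2) - 1;    % equality constraint, solved as g(x) = 0
h   = @(x) x(1) - 0.25;        % inequality constraint, kept as h(x) >= 0
x0  = [0; 0];                  % starting point

[x, obj, info] = sqp (x0, phi, g, h);
% The minimizer should be near [0.5; 0.5]: the equality constraint forces
% x1 + x2 = 1, symmetry gives x1 = x2, and the inequality is inactive there.
```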
Hi John,

I finally understood your confusion (CC to the help and maintainers
mailing lists).

The examples provided in the doc string use "G" and "H" to denote the
gradient and the Hessian, but they *are not* the arguments G and H of the
function (even though the docstring uses @var{g} and @var{h}).

The argument you use to provide the gradient and Hessian of the cost
function is the same one you use to provide the cost function itself.
Let's say your cost is a function called phi and its gradient is gphi.
Let's also assume your constraints are in a function called constraints.
Then you call sqp like this:

  x = sqp (x0, {@phi, @gphi}, @constraints)

For the maintainers: maybe it would be good to change the docstring.
Lines 86 and 95 of sqp.m could read something like (example for line 86):

  85 ## @example
  86 ## Dphi = gradient (@var{x})
  87 ## @end example
  88 ##
  89 ## @noindent
  90 ## in which @var{x} is a vector and Dphi is a vector.

--
M. Sc. Juan Pablo Carbajal
-----
PhD Student
University of Zürich
http://ailab.ifi.uzh.ch/carbajal/

_______________________________________________
Octave-dev mailing list
Octave-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/octave-dev
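[A hypothetical worked version of the cell-array call form described above. The objective, gradient, Hessian, and constraint here are illustrative names only; note that with handles already stored in variables the cell array holds the handles directly, so no extra @ is needed.]

```matlab
% The derivatives of the objective travel in the *second* argument of sqp,
% bundled with the objective itself; constraints keep their own slots.
phi  = @(x) sumsq (x);              % phi : R^n -> R, a scalar
gphi = @(x) 2 * x;                  % gradient of phi, a vector
hphi = @(x) 2 * eye (numel (x));    % Hessian of phi, a matrix
g    = @(x) x(1) + x(2) - 1;        % equality constraint, g(x) = 0
x0   = [3; -2];

% Gradient only:
x = sqp (x0, {phi, gphi}, g);

% Gradient and Hessian (omitting either falls back to finite differences
% and a BFGS update, per the docstring quoted in the thread):
x = sqp (x0, {phi, gphi, hphi}, g);
```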