On Sat, Sep 24, 2011 at 3:19 PM, John Pearson <johnepear...@gmail.com> wrote:
> Then where do the constraints g = 0 and h = 0 go?
>
> On Sat, Sep 24, 2011 at 3:21 AM, Juan Pablo Carbajal
> <carba...@ifi.uzh.ch> wrote:
>>
>> Hi John,
>>
>> SQP uses the gradient of phi, a non-linear function, to compute a local
>> second-order estimate of phi (a Taylor series), and then uses qp() to
>> take the next step of the optimization.  The name "sequential quadratic
>> programming" comes from this fact.  This is clearly stated in the
>> documentation as of version >= 3.4 (if I am not wrong):
>>
>>   "The first element should point to the objective function, the second
>>   should point to a function that computes the gradient of the
>>   objective function, and the third should point to a function that
>>   computes the Hessian of the objective function.  If the gradient
>>   function is not supplied, the gradient is computed by finite
>>   differences.  If the Hessian function is not supplied, a BFGS update
>>   formula is used to approximate the Hessian."
>>
>> Check the code inside; between lines 370 and 445 is more or less where
>> the second-order estimate is calculated.
>>
>> The arguments of the first and second derivatives of a function are the
>> same as the arguments of the function itself:
>>
>>   f(x) ~ f(x0) + G(f)(x0)' * (x - x0) + 0.5 * (x - x0)' * H(f)(x0) * (x - x0)
>>
>> Regards,
>>
>> On Fri, Sep 23, 2011 at 11:36 PM, John Pearson <johnepear...@gmail.com>
>> wrote:
>> > g is an equality constraint.  h is an inequality constraint.  If G
>> > and H are the gradient and Hessian of phi, which arguments are they?
>> > The documentation makes it look as if they go in the 3rd and 4th
>> > spots, which the documentation says are inequality constraints.
>> > Here is what the documentation says:
>> >
>> >   "When supplied, the gradient function must be of the form
>> >
>> >     G = gradient (X)
>> >
>> >   in which X is a vector and G is a vector.
>> >   When supplied, the Hessian function must be of the form
>> >
>> >     H = hessian (X)
>> >
>> >   in which X is a vector and H is a matrix."
>> >
>> > It says that X is a vector and G is a vector.  If they mean the
>> > gradient of phi then it ought to say the gradient of phi, but I
>> > suspect that sqp doesn't use gradient information at all.  I am
>> > achieving success with it by passing an inequality constraint into
>> > the h slot (the 4th argument).  I suspect that the code cannot use
>> > the gradient and Hessian.
>> >
>> > On Fri, Sep 23, 2011 at 3:28 PM, Juan Pablo Carbajal
>> > <carba...@ifi.uzh.ch> wrote:
>> >>
>> >> On Fri, Sep 23, 2011 at 6:18 PM, John Pearson
>> >> <johnepear...@gmail.com> wrote:
>> >> > I'm not sure that this is the proper way to address documentation
>> >> > issues, but I think the help page for "sqp" (sequential quadratic
>> >> > programming) is inconsistent.  Specifically, there are two
>> >> > inconsistent definitions for the 3rd and 4th arguments (G and H):
>> >> >
>> >> >   -- Function File: [X, OBJ, INFO, ITER, NF, LAMBDA] = sqp (X0, PHI)
>> >> >   -- Function File: [...] = sqp (X0, PHI, G)
>> >> >   -- Function File: [...] = sqp (X0, PHI, G, H)
>> >> >   -- Function File: [...] = sqp (X0, PHI, G, H, LB, UB)
>> >> >   -- Function File: [...] = sqp (X0, PHI, G, H, LB, UB, MAXITER)
>> >> >   -- Function File: [...] = sqp (X0, PHI, G, H, LB, UB, MAXITER,
>> >> >      TOLERANCE)
>> >> >      Solve the nonlinear program
>> >> >
>> >> >           min phi (x)
>> >> >            x
>> >> >
>> >> >      subject to
>> >> >
>> >> >           g(x)  = 0
>> >> >           h(x) >= 0
>> >> >           lb <= x <= ub
>> >> >
>> >> > Then it says:
>> >> >
>> >> >   "When supplied, the gradient function must be of the form
>> >> >
>> >> >     G = gradient (X)
>> >> >
>> >> >   in which X is a vector and G is a vector.
>> >> >   When supplied, the Hessian function must be of the form
>> >> >
>> >> >     H = hessian (X)
>> >> >
>> >> >   in which X is a vector and H is a matrix.
>> >> >   The third and fourth arguments are function handles pointing to
>> >> >   functions that compute the equality constraints and the
>> >> >   inequality constraints, respectively."
>> >> >
>> >> > Calling G the gradient of X doesn't make any sense to me.  If X is
>> >> > a vector, its gradient should be a matrix.  In any event, either g
>> >> > is a constraint or it isn't.  If it is a constraint it isn't a
>> >> > gradient, and similarly H isn't a Hessian but an inequality
>> >> > constraint.  Which are they?  They can't be both!
>> >> > Best regards, and thanks for your great work in making Octave a
>> >> > great open-source software tool.
>> >> > John
>> >> > --
>> >> > John E. Pearson
>> >> >
>> >> > _______________________________________________
>> >> > Help-octave mailing list
>> >> > help-oct...@octave.org
>> >> > https://mailman.cae.wisc.edu/listinfo/help-octave
>> >>
>> >> Hi,
>> >> I see no inconsistencies.
>> >> 1. phi(X) is a scalar, but the argument X is a vector: phi: R^n -> R.
>> >> The gradient of a scalar function phi is itself a vector, the
>> >> vector G.
>> >> 2. The Hessian is the Jacobian of the gradient of phi; again, it
>> >> depends on the vector X and it returns a matrix: the Hessian H.
>> >>
>> >> If by inconsistencies you mean the fact that equality constraints
>> >> are defined by the vector function g and inequalities by h, while
>> >> the gradient is called G and the Hessian H -- well, strictly
>> >> speaking g and G are different symbols, but maybe it can be made
>> >> clearer.  I do not know.
>> >>
>> >> Did I address your problem at all?
>> >>
>> >> --
>> >> M. Sc. Juan Pablo Carbajal
>> >> -----
>> >> PhD Student
>> >> University of Zürich
>> >> http://ailab.ifi.uzh.ch/carbajal/
>> >
>> > --
>> > John E. Pearson
>>
>> --
>> M. Sc. Juan Pablo Carbajal
>> -----
>> PhD Student
>> University of Zürich
>> http://ailab.ifi.uzh.ch/carbajal/
>
> --
> John E. Pearson
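[To make the signature under discussion concrete, here is a minimal sketch of the plain four-argument call, with the equality and inequality constraints in the 3rd and 4th slots as the docstring describes. The objective and constraints are hypothetical, not taken from the thread.]

```matlab
% Minimize phi(x) = x'*x subject to x(1) + x(2) - 1 = 0  (equality, slot 3)
% and x(1) - 0.25 >= 0  (inequality, slot 4).
phi = @(x) sumsq (x);          % objective, R^2 -> R (scalar)
g   = @(x) x(1) + x(2) - 1;    % equality constraint, solved as g(x) = 0
h   = @(x) x(1) - 0.25;        % inequality constraint, kept as h(x) >= 0
x0  = [0; 0];                  % starting point

[x, obj, info] = sqp (x0, phi, g, h);
% The minimizer should be near [0.5; 0.5]: the equality constraint forces
% x1 + x2 = 1, symmetry gives x1 = x2, and the inequality is inactive there.
```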
Hi John,

I finally understood your confusion (CC to the help and maintainers
mailing lists).

The examples provided in the doc string use "G" and "H" to denote the
gradient and the Hessian, but they *are not* the arguments G and H of the
function (even though the docstring uses @var{g} and @var{h}).

The argument you use to provide the gradient and Hessian of the cost
function is the same one you use to provide the cost function itself.
Let's say your cost is a function called phi and its gradient is gphi.
Let's also assume your constraints are in a function called constraints.
Then you call sqp like this:

  x = sqp (x0, {@phi, @gphi}, @constraints)

For the maintainers: maybe it would be good to change the docstring.
Lines 86 and 95 of sqp.m could read something like (example for line 86):

  85 ## @example
  86 ## Dphi = gradient (@var{x})
  87 ## @end example
  88 ##
  89 ## @noindent
  90 ## in which @var{x} is a vector and Dphi is a vector.

--
M. Sc. Juan Pablo Carbajal
-----
PhD Student
University of Zürich
http://ailab.ifi.uzh.ch/carbajal/

_______________________________________________
Octave-dev mailing list
Octave-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/octave-dev
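[A hypothetical worked version of the cell-array call form described above. The objective, gradient, Hessian, and constraint here are illustrative names only; note that with handles already stored in variables the cell array holds the handles directly, so no extra @ is needed.]

```matlab
% The derivatives of the objective travel in the *second* argument of sqp,
% bundled with the objective itself; constraints keep their own slots.
phi  = @(x) sumsq (x);              % phi : R^n -> R, a scalar
gphi = @(x) 2 * x;                  % gradient of phi, a vector
hphi = @(x) 2 * eye (numel (x));    % Hessian of phi, a matrix
g    = @(x) x(1) + x(2) - 1;        % equality constraint, g(x) = 0
x0   = [3; -2];

% Gradient only:
x = sqp (x0, {phi, gphi}, g);

% Gradient and Hessian (omitting either falls back to finite differences
% and a BFGS update, per the docstring quoted in the thread):
x = sqp (x0, {phi, gphi, hphi}, g);
```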