I'm seeing this same error (ERROR: assertion failed: lsr.slope[ib] < 0) 
again, and this time my gradients (evaluated at "reasonable" input values) 
match the finite-difference output generated by Calculus.jl's "gradient" 
function.  The function I am trying to minimize is globally convex (it's a 
multinomial logit log-likelihood).
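
For reference, here is roughly how I'm doing the check (a minimal, 
self-contained sketch in Julia 0.3-era vectorized syntax; the quadratic 
below is just a stand-in for my actual likelihood and gradient code):

    using Calculus

    f(x) = sum(x .^ 2)          # stand-in objective
    g_analytic(x) = 2 .* x      # its hand-coded gradient

    x0 = [0.5, -1.3, 2.0]       # a "reasonable" test point
    g_fd = Calculus.gradient(f, x0)               # finite differences
    println(maximum(abs(g_analytic(x0) - g_fd)))  # small => they agree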

I encounter this assertion error after a few successful iterations of BFGS, 
and it is caused by NaNs in the gradient at the test point.  BFGS gets to 
this test point because the step size it passes to hz_linesearch eventually 
becomes large, and a big enough step can cause floating-point errors in the 
calculation of the derivatives.  For example, on a recent minimization 
attempt, the assertion error happens when "c" (the step size passed by bfgs 
to hz_linesearch) appears to be about 380.
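
To make the failure mode concrete: a multinomial logit gradient contains 
softmax-style ratios exp(z_j) / sum_k exp(z_k), and once any exp() 
overflows Float64 to Inf, those ratios evaluate to Inf/Inf = NaN.  A tiny 
illustration (not my actual code):

    z = [800.0, 801.0]               # values this size show up at a huge trial step
    s = sum([exp(zi) for zi in z])   # Inf: exp(800) overflows Float64
    p = [exp(zi) / s for zi in z]    # [NaN, NaN], since Inf / Inf is NaN
    println(p)

The usual guard is the log-sum-exp trick (subtract maximum(z) before 
exponentiating), which keeps these ratios finite even at extreme trial 
points.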

I think this is happening because hz_linesearch (a) expands the step size 
by a factor of 5 (see line 280 in hz_linesearch) until it encounters upward 
movement and (b) passes this new value (or a moving average of it) back to 
the caller (i.e., bfgs).  So, the next time bfgs calls hz_linesearch, it 
starts out with a potentially large value for the first step.

I don't really know much about line search routines, but is this the way 
things ought to be?  I would have thought that for each new call to a line 
search routine, the step size should reset to a default value.

By the way, is it possible to enable display of the internal values of "c" 
in the line search routines?  It looks like there is some debugging code in 
there but I'm not sure how to turn it on.
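
In the meantime, a generic workaround (not an Optim.jl flag, just wrapping 
my own functions) is to log every point the optimizer and line search 
evaluate:

    f(x) = sum(x .^ 2)          # stand-in for the real objective

    function f_logged(x)
        fx = f(x)
        println("f evaluated at |x| = ", sqrt(sum(x .^ 2)), " => ", fx)
        return fx
    end

Passing f_logged (and a similarly wrapped gradient) to optimize makes the 
trial points, and hence the effective step sizes, visible without touching 
hz_linesearch.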

-thom


On Wednesday, July 30, 2014 6:24:26 PM UTC-5, John Myles White wrote:
>
> I’ve never seen our line search methods produce an error that wasn’t 
> caused by errors in the gradient. The line search methods generally only 
> work with function values and gradients, so they’re either buggy (which 
> they haven’t proven to be) or they’re brittle to errors in function 
> definitions/gradient definitions.
>
> Producing better error messages would be great. I once started to do that, 
> but realized that I needed to come back to fully understanding the line 
> search code before I could insert useful errors. Would love to see 
> improvements there.
>
>  — John
>
> On Jul 30, 2014, at 3:17 PM, Thomas Covert <thom....@gmail.com> wrote:
>
> I've done some more sleuthing and have concluded that the problem was on 
> my end (a bug in the gradient calculation, as you predicted). 
>
> Is an inaccurate gradient the only way someone should encounter this 
> assertion error?  I don't know enough about line search methods to have an 
> intuition about that, but if it is the case, maybe the line search routine 
> should throw a more informative error?
>
> -Thom
>
> On Wednesday, July 30, 2014 3:44:51 PM UTC-5, John Myles White wrote:
>>
>> Would be useful to understand exactly what goes wrong if we want to fix 
>> this problem. I’m mostly used to errors caused by inaccurate gradients, so 
>> I don’t have an intuition for the cause of this problem.
>>  
>> — John
>>
>> On Jul 30, 2014, at 10:45 AM, Thomas Covert <thom....@gmail.com> wrote:
>>
>> No, I haven't tried that yet - might someday, but I like the idea of 
>> running native Julia code all the way...  
>>
>> However, I did find that manually switching the line search routine to 
>> "backtracking_linesearch!" did the trick, so at least we know the problem 
>> isn't in Optim.jl's implementation of BFGS itself!
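>>
>> For anyone who finds this later, the switch looked roughly like this on 
>> the Optim.jl version I'm running (the linesearch! keyword and the 
>> g!(x, storage) gradient signature may differ in other versions, and the 
>> quadratic is a stand-in for my actual problem):
>>
>>     using Optim
>>
>>     f(x) = sum(x .^ 2)                        # stand-in objective
>>     g!(x, storage) = (storage[:] = 2 .* x)    # stand-in in-place gradient
>>     res = optimize(f, g!, [1.0, 2.0], method = :bfgs,
>>                    linesearch! = Optim.backtracking_linesearch!)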
>>
>> -thom
>>
>> On Wednesday, July 30, 2014 12:43:16 PM UTC-5, jbeginner wrote:
>>>
>>> This is not really a solution to this problem, but have you tried the 
>>> NLopt library? In my experience it produces much more stable results, and 
>>> because of problems like the one you describe I have switched to it. I 
>>> think there is an L-BFGS option also, although I did not get AD to work 
>>> with it. The descriptions of all the algorithms can be seen here:
>>>
>>> http://ab-initio.mit.edu/wiki/index.php/NLopt_Algorithms
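>>>
>>> A minimal sketch of calling NLopt's L-BFGS from Julia (a quadratic 
>>> stand-in objective; NLopt expects a single callback that returns f(x) 
>>> and fills the gradient in place):
>>>
>>>     using NLopt
>>>
>>>     function myfunc(x::Vector, grad::Vector)
>>>         if length(grad) > 0    # grad is empty for derivative-free algorithms
>>>             grad[1] = 2.0 * x[1]
>>>             grad[2] = 2.0 * x[2]
>>>         end
>>>         return x[1]^2 + x[2]^2
>>>     end
>>>
>>>     opt = Opt(:LD_LBFGS, 2)    # :LD_LBFGS selects L-BFGS
>>>     min_objective!(opt, myfunc)
>>>     (minf, minx, ret) = optimize(opt, [1.0, 2.0])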
>>>
>>>
>>>
>>> On Wednesday, July 30, 2014 12:27:36 PM UTC-4, Thomas Covert wrote:
>>>>
>>>> Recently I've encountered line search errors when using Optim.jl with 
>>>> BFGS.  Here is an example error message:
>>>>
>>>> ERROR: assertion failed: lsr.slope[ib] < 0
>>>>  in bisect! at /pathtojulia/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:577
>>>>  in hz_linesearch! at /pathtojulia/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:273
>>>>  in hz_linesearch! at /pathtojulia/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:201
>>>>  in bfgs at /pathtojulia/.julia/v0.3/Optim/src/bfgs.jl:121
>>>>  in optimize at /pathtojulia/.julia/v0.3/Optim/src/optimize.jl:113
>>>> while loading /pathtocode/code.jl, in expression starting on line 229
>>>>
>>>>
>>>> I've seen this error message before, and it's usually because I have a 
>>>> bug in my code that erroneously generates function values or gradients 
>>>> which are very large (e.g., 1e100).  However, in this case I can confirm 
>>>> that the "x" I've passed to the optimizer is totally reasonable (abs value 
>>>> of all entries less than 100), the function value at that x is reasonable 
>>>> (on the order of 1e6), the gradients are reasonable (between -100 and 
>>>> +100), and the entries in the approximate inverse Hessian are also 
>>>> reasonable (smallest abs value is about 1e-9, largest is about 7).
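>>>>
>>>> (A minimal sketch of the kind of sanity checks I ran, in Julia 0.3-era 
>>>> vectorized syntax, with stand-in values for x, the function value fx, 
>>>> and the gradient grad:)
>>>>
>>>>     x = [0.5, -1.3]; fx = 1.2e6; grad = [-10.0, 42.0]   # stand-ins
>>>>     @assert all(isfinite(x)) && maximum(abs(x)) < 100
>>>>     @assert isfinite(fx)
>>>>     @assert all(isfinite(grad)) && maximum(abs(grad)) <= 100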
>>>>
>>>>
>>>> This isn't a failure on the first or second iteration of BFGS - it 
>>>> happens on the 34th iteration.
>>>>
>>>>
>>>> Unfortunately, it's pretty hard for me to share my code or data at the 
>>>> moment, so I understand that it might be challenging to solve this 
>>>> problem, but any advice you guys can offer is appreciated!
>>>>
>>>>
>>>> -Thom
>>>>
>>>
>>
>
