Hola,

On 09/21/2015 04:15 PM, Gilles wrote:
Hi.

On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
On 09/20/2015 05:51 AM, Gilles wrote:
On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
General Optimizer) design.  For example with the
LevenbergMarquardtOptimizer we would do:
`LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`

Rough optimize() outline:
public static void optimize(OptimizationContext c) {
    // Perform the optimization.
    // If successful:
    c.notify(LevenbergMarquardtResultsEnum.SUCCESS, solution);
    // If not successful:
    c.notify(LevenbergMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE, diagnostic);
    // or
    c.notify(LevenbergMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE, diagnostic);
    // etc.
}

The diagnostic, when turned on, will contain a trace of the last N
iterations leading up to the failure.  When turned off, the Diagnostic
instance only contains the parameters used to detect failure. The
diagnostic could be viewed as an indirect way to log optimizer
iterations.
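A very rough sketch of what I mean by the Diagnostic (all names are hypothetical, nothing here exists in CM today):

import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch only.  With tracing on, the diagnostic keeps the last N
// iterations; with tracing off, it only keeps the parameters that were used to
// detect the failure.
public class Diagnostic {

    public static final class Iteration {
        public final int number;
        public final double cost;
        public final double[] point;

        Iteration(int number, double cost, double[] point) {
            this.number = number;
            this.cost = cost;
            this.point = point.clone();
        }
    }

    private final int maxKept;                       // N; 0 means tracing is off
    private final Deque<Iteration> trace = new ArrayDeque<>();
    private double costRelativeTolerance;
    private double parametersRelativeTolerance;

    public Diagnostic(int maxKept) {
        this.maxKept = maxKept;
    }

    // Called by the optimizer after each iteration.
    void record(int number, double cost, double[] point) {
        if (maxKept == 0) {
            return;                                  // tracing off
        }
        if (trace.size() == maxKept) {
            trace.removeFirst();                     // keep only the last N
        }
        trace.addLast(new Iteration(number, cost, point));
    }

    // Called by the optimizer when it gives up.
    void failureParameters(double costTol, double paramsTol) {
        costRelativeTolerance = costTol;
        parametersRelativeTolerance = paramsTol;
    }

    public Deque<Iteration> lastIterations() {
        return trace;
    }

    public double costRelativeTolerance() { return costRelativeTolerance; }
    public double parametersRelativeTolerance() { return parametersRelativeTolerance; }
}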

WDYT?

I'm wary of having several different ways to convey information to the
caller.
It would just be one way.

One way for optimizer, one way for solvers, one way for ...

Yes, I see what you mean, but I think on the whole it will be worth it to add the
additional sugar code that removes the need for exceptions.


But the caller may not be the receiver
(It could be).  The receiver would be an observer attached to the
OptimizationContext that implements an interface allowing it to observe
the optimization.
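Very roughly (hypothetical names again), the context would just carry the observers and forward whatever the optimizer reports:

import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the caller registers an observer on the context and the
// optimizer only ever talks to the context, never directly to the caller.
interface OptimizationObserver<R extends Enum<R>> {
    void update(R result, Object payload);   // payload: a Solution or a Diagnostic
}

class OptimizationContext<R extends Enum<R>> {

    private final List<OptimizationObserver<R>> observers = new ArrayList<>();

    OptimizationContext<R> addObserver(OptimizationObserver<R> observer) {
        observers.add(observer);
        return this;
    }

    void notify(R result, Object payload) {
        for (OptimizationObserver<R> observer : observers) {
            observer.update(result, payload);
        }
    }
}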

I'm afraid that it will add to the questions of what to put in the
code and how.  [We already had sometimes heated discussions just for
the IMHO obvious (e.g. code formatting, documentation, exception...).]

Hehe.  Yes, I remember some of these discussions.  I wonder how much time was
spent debating the exceptions alone?  Surely everyone must have had this
feeling in the pit of their stomach that there's got to be a better way.  On the
exception topic, these are some of the issues:

I18N
===================
If you are new to Commons Math and thinking about designing a Commons-Math-compatible
exception, you should probably understand the I18N machinery that's bound to the
exception (and wonder why it's bound to the exception in the first place).  Grab a
coffee and spend a few hours on it, unless you are fairly new to Java like some of the
people posting for help.  In that case, when the exception occurs, there is
going to be a lot of tutoring going on on the users list.
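To make that concrete, here is roughly what a new "CM-compatible" exception ends up looking like (names illustrative; the base class and message-key pattern are from memory of the 3.x exception package):

import org.apache.commons.math3.exception.MathIllegalArgumentException;
import org.apache.commons.math3.exception.util.LocalizedFormats;

// Illustrative only.  Note that the "message" is not a plain String: it is a
// key into the I18N layer, so designing a new exception also means picking (or
// adding) an entry in LocalizedFormats and understanding how it gets localized.
public class TooManyWidgetsException extends MathIllegalArgumentException {

    public TooManyWidgetsException(int actual, int max) {
        super(LocalizedFormats.NUMBER_TOO_LARGE, actual, max);
    }
}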

Number of Exceptions
===================
Before you actually design a new exception, you should probably see if there
is an exception that already fits the category of what you are doing.  So you
start reading.  Exception1... nope.  Exception2... nope... Exception3... Exception999...
but I think I'm getting warmer.  OK, did not find it... but I'm fairly certain that
there is an elegant place for it somewhere in the exception hierarchy...


Handling of Exceptions
===================
If our app uses several of the Commons Math classes (that throw exceptions of
the same type), and one of those classes throws an exception, what is the app
supposed to do?

I think most developers would find that question somewhat challenging.  There
are numerous strategies: catch all exceptions and log what happened, etc.  But
what if the requirement is that, when an exception is thrown, the organization
that receives it has zero time to get to the root cause and understand
the dynamics?  Is this doable?  (Yes, obviously, but how hard is it...?)
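To illustrate (the step names are mine, not actual CM calls), the best the app can usually do is one broad catch, and then reverse-engineer the cause after the fact:

import org.apache.commons.math3.exception.MathIllegalStateException;

// Schematic only.  Several unrelated steps can throw the same CM exception
// type, so the application ends up with a single broad catch and has to dig
// through the message / stack trace to find out what actually went wrong.
public class MathPipeline {

    public void run() {
        try {
            rootFindingStep();     // calls into a CM solver
            curveFittingStep();    // calls into a CM optimizer
        } catch (MathIllegalStateException e) {
            // Which step failed, and why?  Could we have recovered by relaxing
            // a tolerance?  All of that has to be reconstructed here.
            System.err.println("math pipeline failed: " + e.getMessage());
        }
    }

    private void rootFindingStep()  { /* may throw MathIllegalStateException */ }
    private void curveFittingStep() { /* may throw MathIllegalStateException */ }
}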


It seems that the reporting interfaces could quickly overwhelm
the "actual" code (one type of context per algorithm).
There would be one type of Observer interface per algorithm.  It would
act on the solution and on what are currently exceptions, although these
would be translated into enums.
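Something like this, i.e. the conditions the LM optimizer currently reports via exceptions become plain values the observer can act on (hypothetical sketch; the list of values is just from memory of the current failure modes):

// Hypothetical sketch of the translated "exceptions" for the LM optimizer.
public enum ResultType {
    SUCCESS,
    TOO_SMALL_COST_RELATIVE_TOLERANCE,
    TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
    TOO_SMALL_ORTHOGONALITY_TOLERANCE,
    TOO_MANY_EVALUATIONS
}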

Unless I'm mistaken, the most common use-case for codes implemented
in a library such as CM is to provide a correct answer or bail out
in a non-equivocal way.
Most Java developers are used to synchronous coding: call the method, get the
response, catch the exception if needed.  This is changing with JDK8, and as
we evolve and start using lambdas, we become more accustomed to the functional
callback style of programming.  Personally, I want to be able to use an API that
gives me what I need when everything works as expected, allows me to resolve
unexpected issues with minimal effort, and is as simple, fluid, and lightweight
as possible.
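For instance, if the two callbacks are plain functional interfaces, a JDK8 user could wire them up inline instead of writing an observer class (hypothetical sketch, not an existing API):

import java.util.Arrays;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Hypothetical sketch of the lambda-friendly style: optimize() takes the two
// callbacks directly, and the caller supplies them as lambdas.
public class LambdaStyle {

    static void optimize(Consumer<double[]> onSuccess,
                         BiConsumer<String, double[]> onFailure) {
        // ... run the optimization, then invoke exactly one of the callbacks ...
        onSuccess.accept(new double[] { 1.0, 2.0 });
    }

    public static void main(String[] args) {
        optimize(
            solution -> System.out.println("solution: " + Arrays.toString(solution)),
            (reason, lastPoint) -> System.err.println("failed: " + reason));
    }
}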


It would make the code more involved to handle a minority of
(undefined) cases. [Actual examples would be welcome in order to
focus the discussion.]

Rough Outline (I've evolved the concept and moved away from the 
OptimizationContext in the process of writing):

interface LevenbergMarquardtObserver {

    public void hola(Solution s);
    public void sugarHoneyIceTea(ResultType rt, Diagnostic d);
}

public class LMObserver implements LevenbergMarquardtObserver {

    private Application application;

    public LMObserver(Application application) {
        this.application = application;
    }

    public void hola(Solution solution) {
        application.next(solution);
    }

    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
        if (rt == ResultType.I_GOT_THIS_ONE) {
            // I looked at the commons unit tests for this algorithm, evaluating
            // the diagnostics that show how this failure can occur.
            // I'm totally fixing this!  Step aside!
        } else if (rt == ResultType.REALLY_COMPLICATED_STUFF) {
            // We need our best engineers... call India.
        }
    }
}


public class Application {

    public void solve() {
        // Note: nothing is returned.
        LevenbergMarquardtOptimizer.setObserver(new LMObserver(this))
                                   .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
                                   .start();
    }

    public void next(Solution solution) {
        // Do cool stuff.
    }
}

Or an asynchronous variation:

public class Application {

    public void solve() {
        // This call will not block because async is true.
        LevenbergMarquardtOptimizer.setAsync(true)
                                   .setObserver(new LMObserver(this))
                                   .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
                                   .start();
        // Do more stuff right away.
    }

    public void next(Solution solution) {
        // When the thread running the optimization is done, this method is called back.
        // Do whatever comes next.
    }
}

The above would start the optimization in a separate thread that does not / 
SHOULD NOT share data with the main thread.


The current reporting is based on exceptions, and assumes that if no
exception was thrown, then the user's request completed successfully.
Sure - personally I'd much rather deal with something similar to an
HTTP status code in a callback than with an exception.  I think the code
is cleaner, and the callback makes it more elegant to apply an adaptive
approach to handling the response, like slightly relaxing constraints,
convergence parameters, etc.  Also, by getting rid of the exceptions
we no longer depend on the I18N layer that they are tied to, and the
messages can be more informative, since they target the root cause.
The observer can also run in the 'main' thread while the optimization
runs asynchronously.  Also, WRT JDK9 and modules, losing the exceptions
would mean one less dependency when the library is split up into JDK9
modules, which would be more in line with this philosophy:
https://github.com/substack/browserify-handbook#module-philosophy
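For example, the "adaptive" handling could look roughly like this (reusing the hypothetical names from the outline above):

// Rough sketch of adaptive handling in the callback: on a tolerance failure the
// observer relaxes the setting and restarts, instead of the caller unwinding
// through an exception.  All names are the hypothetical ones from the outline.
public class AdaptiveObserver implements LevenbergMarquardtObserver {

    private double costRelativeTolerance = 1e-10;
    private int attempts = 0;

    public void hola(Solution solution) {
        // Converged: hand the solution to the application.
    }

    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
        if (rt == ResultType.TOO_SMALL_COST_RELATIVE_TOLERANCE && attempts < 3) {
            attempts++;
            costRelativeTolerance *= 10;   // relax the constraint...
            // ...and start a new run with the relaxed tolerance, e.g.
            // LevenbergMarquardtOptimizer.setObserver(this)
            //     .setCostRelativeTolerance(costRelativeTolerance)
            //     .setLeastSquaresProblem(problem)
            //     .start();
        }
    }
}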

I'm not sure I fully understood the philosophy from the text in this
short paragraph.
But I do not agree with the idea that the possibility to quickly find
some code is more important than standards and best practices.

If you go to npmjs.org and type in "Neural Network" you will get 56 results, all
linked to GitHub repositories.

In addition there's metadata indicating the number of downloads in the last day,
last month, etc.  Try typing in "cosine".  Odds are you will find a package that
does just what you want and nothing else.  This is very underwhelming, and
refreshing, in terms of cloning off of GitHub and getting familiar with the tests
etc.  Also eye opening.  How many of us knew that we could do that much stuff
with cosine! :)


I totally agree that in some circumstances, more information on the
inner working of an algorithm would be quite useful.
... Algorithm iterations become unit testable.
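E.g. something along these lines (JUnit; Diagnostic here is the hypothetical class sketched earlier in this mail):

import org.junit.Assert;
import org.junit.Test;

// Hypothetical sketch of what "iterations become unit testable" could mean:
// the test inspects the recorded iteration trace instead of only the end state.
public class LevenbergMarquardtIterationTest {

    @Test
    public void costShouldNotIncreaseBetweenIterations() {
        Diagnostic diagnostic = new Diagnostic(50);   // keep the last 50 iterations
        // ... run the observer-based optimizer with this diagnostic attached ...
        double previousCost = Double.POSITIVE_INFINITY;
        for (Diagnostic.Iteration it : diagnostic.lastIterations()) {
            Assert.assertTrue("cost should not increase", it.cost <= previousCost);
            previousCost = it.cost;
        }
    }
}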

But I don't see the point in devoting resources to reinvent the wheel:
You mean pimping the wheel?  Big pimpin.

I think that logging statements are easy to add, not disruptive at all,
and come in handy to understand a code's unexpected behaviour.
Assuming that a "logging" feature is useful, it can be added *now* using
a dependency towards a weight-less (!) framework such as "slf4j".
IMO, it would be a waste of time to implement a new communication layer
that can do that, and more, if it would be used for logging only in 99%
of the cases.
SLF4J is used by almost every other framework, so why not use it? Logging and 
the diagnostic could be used together.  The primary purpose of the diagnostic 
though is to collect data that will be useful in `sugarHoneyIceTea`.
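For example (hypothetical names from the outline above), the observer could log through SLF4J while the Diagnostic carries the structured data that sugarHoneyIceTea actually acts on:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Rough sketch of logging and the diagnostic used together: SLF4J provides the
// human-readable trail, the Diagnostic provides the data for the decision in
// sugarHoneyIceTea.  Observer/Solution/ResultType/Diagnostic are the hypothetical
// types from the outline above.
public class LoggingLMObserver implements LevenbergMarquardtObserver {

    private static final Logger LOG = LoggerFactory.getLogger(LoggingLMObserver.class);

    public void hola(Solution solution) {
        LOG.info("LM converged");
    }

    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
        LOG.warn("LM did not converge: {} ({} iterations kept)", rt, d.lastIterations().size());
        // Decide here whether to retry, relax a tolerance, or give up.
    }
}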



I longed several times for the use of a logging library.
The only show-stopper has been the informal "no-dependency" policy...
JDK9 Jigsaw should solve dependency hell, so the less coupling
between commons math classes the better.

I wouldn't call "coupling" the dependency towards exception classes:
they are little utilities that can make sense in various parts of the
library.

If, for example, the Simplex solver is broken off into its own module, then it
has to be coupled to the exceptions, unless it is exception free.


[Unless one wants to embark on yet another discussion about exceptions;
whether there should be one class for each of the "messages" that exist
in "LocalizedFormats"; whether localization should be done in CM;
etc.]

I think it would be best to just eliminate the exceptions.


Anyways, I'm obviously
interested in playing with this stuff, so when I get something up into
a repository I'll do a callback :).

If you are interested in big overhauls, there is one that gathered
relative consensus: rewrite the algorithms in a "multithread-friendly"
way.
I think that's a tall order that will take us into JDK88 :).  But using
callbacks and making potentially long-running computations asynchronous could
be a middle ground that would allow simple multithreaded use without fiddling
around under the hood...
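Under the hood that middle ground could be as simple as this (hypothetical sketch):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Rough sketch: start() hands the computation to a single worker thread and
// returns immediately; the observer is notified from that worker thread when
// the optimization finishes, so nothing is shared with the main thread.
public final class AsyncStart {

    private static final ExecutorService WORKER = Executors.newSingleThreadExecutor();

    public static void start(Runnable optimization) {
        WORKER.submit(optimization);
    }

    public static void main(String[] args) {
        start(() -> {
            // ... run the optimizer here and, when done, notify the observer
            // (e.g. observer.hola(solution)) from this worker thread ...
        });
        // The caller carries on right away.
    }
}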


Some ideas were floated (cf. ML archive) but no implementation or
experiment...  Perhaps with a well-defined goal such as performance
improvement, your design suggestions will become clearer to more people.

AFAIK, only the classes in the "o.a.c.m.neuralnet" package are currently
ready to be used with the "java.util.concurrent" framework.
FWIU Neural Nets are a great fit for concurrency.  I think for the others we
will end up having discussions around how users would control the number of
threads, etc., which again makes some of us nervous.  An asynchronous operation
that runs in one separate thread is easier to reason about.  If we want to test
10 neural net configurations, and we have 10 cores, then we can start each by
itself by doing something like:

Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
// Now do 10 more.
// If the observer is shared then notifications should be thread safe.
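
E.g. the shared observer could simply collect results into a concurrent queue (hypothetical sketch; Solution is the type from the outline above):

import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of a thread-safe shared observer for the 10-network case:
// each worker thread appends its result to a concurrent queue, so notification
// needs no explicit locking.
public class SharedNetworkObserver {

    private final ConcurrentLinkedQueue<Solution> results = new ConcurrentLinkedQueue<>();

    // Called from whichever worker thread finished its network.
    public void hola(Solution solution) {
        results.add(solution);
    }

    // Read from the main thread once all runs are done.
    public Iterable<Solution> results() {
        return results;
    }
}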

Cheers,
- Ole

P.S. Dang that was a long email.  If I write one more of these, ban me :)



Best regards,
Gilles


Cheers,
Ole


