On 4/30/13 10:27 AM, sebb wrote:
> On 30 April 2013 17:28, Phil Steitz <[email protected]> wrote:
>
>> On 4/29/13 9:40 AM, Luc Maisonobe wrote:
>>> Hi all,
>>>
>>> Since the 2.x series, we have been struggling in several areas with
>>> respect to the algorithms API. The latest change was about optimizers,
>>> but it is only one example among others (solvers, integration, ODE and
>>> maybe some parts of statistics may be concerned by the proposal below).
>>>
>>> The various things we want to keep, and which are not always compatible
>>> with each other, are:
>>>
>>>  1) simple use
>>>  2) immutability
>>>  3) good OO design
>>>  4) compatible with reference algorithms implementations
>>>  5) maintainable
>>>  6) extensible
>>>  7) backward compatibility
>>>  8) probably many other characteristics ...
>>>
>>> 3) and 4) often don't work together. 1), 6) and 7) are difficult to
>>> handle at once.
>>>
>>> If we look at optimizers, some progress has been made with respect to
>>> extensibility and backward compatibility, but simple use was clearly
>>> left behind: it is difficult to know which optimizer supports which
>>> feature, as neither strong typing nor fixed arguments are used anymore.
>>> However, keeping the older API would have prevented extensibility, as
>>> the combinatorial explosion of arguments grows as features are added
>>> (and we still need to add several constraint types).
>>>
>>> If we look at ODE solvers, we are still using the original API from
>>> mantissa, but when we add a new feature, we add more and more setters,
>>> thus drifting farther and farther from immutability and imposing an
>>> unwritten ordering between calls (for example, when we set up
>>> additional equations, we must also set up the initial additional state,
>>> and the user must set up a way to retrieve the final additional state).
>>>
>>> If we look at solvers, we started with some parameters set up during the
>>> call to solve while others were set up at construction time, but this
>>> split has changed over time.
>>>
>>> So I would like to suggest a new approach, which has been largely
>>> inspired by a recent discussion on the [CSV] component about the builder
>>> API (see <http://markmail.org/thread/o3s2a5hyj6xh4nzj>), by an older
>>> discussion on [math] about using a fluent API for vectors (see
>>> <http://markmail.org/message/2gmg6wnpm5p2splb>), and by a talk Simone
>>> gave last year at ApacheCon Europe. The idea is to use a fluent API to
>>> progressively build the algorithm, adding features one at a time using
>>> withXxx methods defined in interfaces.
>>>
>>> As an example, consider just a few features used in optimization:
>>> constraints, iteration limits, evaluation limits, search intervals,
>>> bracketing steps ... Some features are used in several optimizers, some
>>> are specific to univariate solvers, some can be used in a family of
>>> solvers ... Trying to fit everything into a single class hierarchy is
>>> impossible. We tried, but I don't think we succeeded.
>>>
>>> If we consider each feature separately, we could have interfaces
>>> defined for each one as follows:
>>>
>>>   interface Constrainable<T extends Constrainable<T>>
>>>             extends Optimizer {
>>>     /** Returns a new optimizer, handling an additional constraint.
>>>      * @param c the constraint to add
>>>      * @return a new optimizer handling the constraint
>>>      *         (note that the instance itself is not changed)
>>>      */
>>>     T withConstraint(Constraint c);
>>>   }
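>>>
>>> The companion interface for the iteration limit would follow the same
>>> pattern (just a sketch, the exact names are illustrative):
>>>
>>>   interface IterationLimited<T extends IterationLimited<T>>
>>>             extends Optimizer {
>>>     /** Returns a new optimizer with an updated iteration limit.
>>>      * @param maxIter maximal number of iterations
>>>      * @return a new optimizer using the specified limit
>>>      *         (note that the instance itself is not changed)
>>>      */
>>>     T withMaxIter(int maxIter);
>>>   }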
>>>
>>> Basically, they would be used where OptimizationData is used today.
>>> An optimizer that supports simple bounds constraints and max iterations
>>> would be defined as:
>>>
>>>   public class TheOptimizer
>>>          implements Optimizer,
>>>                     Constrainable<TheOptimizer>,
>>>                     IterationLimited<TheOptimizer> {
>>>
>>>     private final int maxIter;
>>>     private final List<Constraint> constraints;
>>>
>>>     // internal constructor used for fluent API
>>>     private TheOptimizer(..., int maxIter, List<Constraint> list) {
>>>       ...
>>>       this.maxIter     = maxIter;
>>>       this.constraints = list;
>>>     }
>>>
>>>     public TheOptimizer withConstraint(Constraint c) {
>>>       List<Constraint> l = new ArrayList<Constraint>(constraints);
>>>       l.add(c);
>>>       return new TheOptimizer(..., maxIter, l);
>>>     }
>>>
>>>     public TheOptimizer withMaxIter(int m) {
>>>       return new TheOptimizer(..., m, constraints);
>>>     }
>>>
>>>  }
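>>>
>>> Building a configured instance would then read naturally (assuming
>>> some public entry point, say a no-argument constructor providing
>>> default settings):
>>>
>>>   TheOptimizer optimizer = new TheOptimizer()
>>>                                .withMaxIter(100)
>>>                                .withConstraint(c);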
>>>
>>> So basically, the withXxx methods are sort-of setters, but they preserve
>>> immutability (we do not return this, we return a new object). It is easy
>>> to add features to existing classes and there is no need to shove
>>> everything into a single hierarchy: we have a forest, not a tree. When
>>> looking at the API, users clearly see what they can use and what they
>>> cannot use: if an optimizer does not support constraints, there will be
>>> no way to put a constraint into it. If constraints become available in a
>>> later version, the existing methods will not be changed, only new
>>> methods will appear.
>>>
>>> Of course, this creates a bunch of intermediate objects, but they are
>>> often quite small and the setup phase is not the most
>>> computation-intensive one. It also becomes possible to do parametric
>>> studies on some features, using code like:
>>>
>>>   Algorithm core = new Algorithm().withA(a).withB(b).withC(c);
>>>   for (double d = 0.0; d < 1.0; d += 0.001) {
>>>     Algorithm dSpecial = core.withD(d);
>>>     double result = dSpecial.run();
>>>     System.out.println(" d = " + d + ", result = " + result);
>>>   }
>>>
>>> This would work for someone who considers feature A a core feature that
>>> should be fixed and feature D a parameter, but it would work equally
>>> well in the opposite case: they would simply write the loop the other
>>> way around, with the call to withD outside the loop and the call to
>>> withA inside it, as sketched below.
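>>>
>>> For example, with A as the swept feature (same hypothetical Algorithm
>>> class as above):
>>>
>>>   Algorithm core = new Algorithm().withB(b).withC(c).withD(d);
>>>   for (double a = 0.0; a < 1.0; a += 0.001) {
>>>     Algorithm aSpecial = core.withA(a);
>>>     double result = aSpecial.run();
>>>     System.out.println(" a = " + a + ", result = " + result);
>>>   }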
>>>
>>> A side effect is that it also becomes possible to copy algorithms safely
>>> by just resetting a feature, even when we don't really know which
>>> implementation we have. A typical example that creates problems for me
>>> is duplicating an ODE solver. It cannot be done currently, as some
>>> specific elements that depend on the exact type of solver are required
>>> at construction time (tolerance vectors for adaptive stepsize
>>> integrators). So if, for example, I want to do some Monte-Carlo analysis
>>> in parallel and need to duplicate an integrator, I would do it as
>>> follows:
>>>
>>>   FirstOrderIntegrator[]
>>>       duplicate(FirstOrderIntegrator integrator, int n) {
>>>     FirstOrderIntegrator[] copies = new FirstOrderIntegrator[n];
>>>     for (int i = 0; i < n; ++i) {
>>>       // re-setting an existing feature yields an independent copy
>>>       copies[i] =
>>>         integrator.withMaxEvaluations(integrator.getMaxEvaluations());
>>>     }
>>>     return copies;
>>>   }
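>>>
>>> Each worker could then run with its own private copy (a sketch;
>>> performRun stands for whatever a single Monte-Carlo run does):
>>>
>>>   FirstOrderIntegrator[] copies = duplicate(integrator, nbThreads);
>>>   for (int i = 0; i < nbThreads; ++i) {
>>>     final FirstOrderIntegrator own = copies[i];
>>>     new Thread(new Runnable() {
>>>       public void run() {
>>>         // hypothetical helper: one independent Monte-Carlo run
>>>         performRun(own);
>>>       }
>>>     }).start();
>>>   }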
>>>
>>> This kind of API could be extended to several algorithms, so it may be
>>> set up in a consistent way across the library. As I wrote at the
>>> beginning of this message, I am thinking first of root solvers,
>>> optimizers and ODE.
>>>
>>> What do you think?
>> I don't have experience implementing this pattern, so don't know
>> what best practices / pitfalls there may be.  IIRC, what you are
>> getting is formal immutability and more compact code due to inline
>> withXxx calls instead of setXxx, at the expense of lots of extra
>> instance creation.  I guess the latter is not something to worry much about
>> in today's world.  We just have to be careful with the pattern in
>> the cases where constructors actually do something.  What is not
>> crystal clear to me is what exactly you get in flexibility beyond
>> what you would get just adding setters ad hoc (as the withXxx stuff
>> effectively does).  It is interesting as well to consider the
>> reasons that we favor immutability and whether or not this approach
>> actually improves anything (e.g., concurrency: yes, helps; path /
>> state complexity: not so much).
>>
>>
> Huh? State complexity is surely much reduced if fields cannot be changed
> after instance construction.

By a naive, syntactic definition, yes. But what is important is
the overall state / path complexity of whatever app is using this
stuff.  It is not clear to me that spreading that mutability over
lots of intermediary instances is any better than just mutating a
single instance.

Phil
>
>
>> Phil
>>
>>
>>> Luc
>>>
>>

