On 4/30/13 10:27 AM, sebb wrote:
> On 30 April 2013 17:28, Phil Steitz <[email protected]> wrote:
>
>> On 4/29/13 9:40 AM, Luc Maisonobe wrote:
>>> Hi all,
>>>
>>> Since the 2.x series, we have been struggling in several areas with
>>> respect to algorithm APIs. The latest change was about optimizers,
>>> but it is only one example among others (solvers, integration, ODE
>>> and maybe some parts of statistics may be concerned by the proposal
>>> below).
>>>
>>> The various things we want to keep, and which are not always
>>> compatible with each other, are:
>>>
>>> 1) simple use
>>> 2) immutability
>>> 3) good OO design
>>> 4) compatibility with reference algorithm implementations
>>> 5) maintainability
>>> 6) extensibility
>>> 7) backward compatibility
>>> 8) probably many other characteristics ...
>>>
>>> 3) and 4) often don't work together. 1), 6) and 7) are difficult to
>>> handle at once.
>>>
>>> If we look at optimizers, some progress has been made with respect
>>> to extensibility and backward compatibility, but simple use was
>>> clearly left behind: it is difficult to know which optimizer
>>> supports which feature, as neither strong typing nor fixed
>>> arguments are used anymore. However, keeping the older API would
>>> have prevented extensibility, as the combinatorial explosion of
>>> arguments worsens as features are added (and we still need to add
>>> several constraint types).
>>>
>>> If we look at ODE solvers, we are still using the original API from
>>> Mantissa, but when we add a new feature, we add more and more
>>> setters, thus going farther and farther from immutability, and
>>> imposing some unwritten ordering between calls (for example, when
>>> we set up additional equations, we must also set up the initial
>>> additional state, and the user must set up a way to retrieve the
>>> final additional state).
>>>
>>> If we look at solvers, we started with some parameters set up
>>> during the call to solve while others were set up at construction
>>> time, but this split has changed over time.
>>>
>>> So I would like to suggest a new approach, which has been largely
>>> inspired by a recent discussion on the [CSV] component about the
>>> builder API (see <http://markmail.org/thread/o3s2a5hyj6xh4nzj>), by
>>> an older discussion on [math] about using a fluent API for vectors
>>> (see <http://markmail.org/message/2gmg6wnpm5p2splb>), and by a talk
>>> Simone gave last year at ApacheCon Europe. The idea is to use a
>>> fluent API to build the algorithm progressively, adding features
>>> one at a time using withXxx methods defined in interfaces.
>>>
>>> As an example, consider just a few features used in optimization:
>>> constraints, iteration limit, evaluation limits, search interval,
>>> bracketing steps ... Some features are used in several optimizers,
>>> some are specific to univariate solvers, some can be used in a
>>> family of solvers ... Trying to fit everything into a single class
>>> hierarchy is impossible. We tried, but I don't think we succeeded.
>>>
>>> If we consider each feature separately, we could have interfaces
>>> defined for each one as follows:
>>>
>>>   interface Constrainable<T extends Constrainable<T>>
>>>       extends Optimizer {
>>>
>>>       /** Returns a new optimizer, handling an additional constraint.
>>>        * @param c the constraint to add
>>>        * @return a new optimizer handling the constraint
>>>        * (note that the instance itself is not changed)
>>>        */
>>>       T withConstraint(Constraint c);
>>>
>>>   }
>>>
>>> Basically they would be used where OptimizationData is used today.
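For concreteness, here is a sketch of the companion IterationLimited
interface that the TheOptimizer example below implements. This
definition is assumed for illustration (the proposal does not spell it
out), and Optimizer is reduced to a bare marker interface:

    // Assumed marker super-interface; a real Optimizer would carry
    // the actual optimization entry points.
    interface Optimizer {
    }

    // Hypothetical companion to Constrainable, following the same
    // self-referencing generic convention so that withMaxIter returns
    // the concrete optimizer type.
    interface IterationLimited<T extends IterationLimited<T>>
        extends Optimizer {

        /** Returns a new optimizer enforcing the given iteration limit.
         * @param maxIter maximum number of iterations allowed
         * @return a new optimizer with the limit set
         * (note that the instance itself is not changed)
         */
        T withMaxIter(int maxIter);

    }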
>>> An optimizer that supports simple bounds constraints and max
>>> iterations would be defined as:
>>>
>>>   public class TheOptimizer
>>>       implements Optimizer,
>>>                  Constrainable<TheOptimizer>,
>>>                  IterationLimited<TheOptimizer> {
>>>
>>>       private final int maxIter;
>>>       private final List<Constraint> constraints;
>>>
>>>       // internal constructor used for fluent API
>>>       private TheOptimizer(..., int maxIter, List<Constraint> list) {
>>>           ...
>>>           this.maxIter     = maxIter;
>>>           this.constraints = list;
>>>       }
>>>
>>>       public TheOptimizer withConstraint(Constraint c) {
>>>           List<Constraint> l = new ArrayList<Constraint>(constraints);
>>>           l.add(c);
>>>           return new TheOptimizer(..., maxIter, l);
>>>       }
>>>
>>>       public TheOptimizer withMaxIter(int m) {
>>>           return new TheOptimizer(..., m, constraints);
>>>       }
>>>
>>>   }
>>>
>>> So basically the withXxx methods are sort-of setters, but they
>>> preserve immutability (we do not return this, we return a new
>>> object). It is easy to add features to existing classes, and there
>>> is no need to shove everything into a single hierarchy: we have a
>>> forest, not a tree. When looking at the API, users clearly see what
>>> they can use and what they cannot: if an optimizer does not support
>>> constraints, there is no way to put a constraint into it. If in a
>>> later version constraints become available, the existing methods
>>> will not change; only new methods will appear.
>>>
>>> Of course, this creates a bunch of intermediate objects, but they
>>> are often quite small, and the setting part is not the most
>>> computation-intensive one. It also becomes possible to do
>>> parametric studies on some features, using code like:
>>>
>>>   Algorithm core = new Algorithm().withA(a).withB(b).withC(c);
>>>   for (double d = 0.0; d < 1.0; d += 0.001) {
>>>       Algorithm dSpecial = core.withD(d);
>>>       double result = dSpecial.run();
>>>       System.out.println(" d = " + d + ", result = " + result);
>>>   }
>>>
>>> This would work for someone who considers feature A a core feature
>>> that should be fixed and feature D a parameter, but it would work
>>> equally well in the opposite case: they would simply write the loop
>>> the other way around, with the call to withD outside the loop and
>>> the call to withA inside it.
>>>
>>> A side effect is that it also becomes possible to copy algorithms
>>> safely by just resetting a feature, even when we don't really know
>>> which implementation we have. A typical example that creates
>>> problems for me is duplicating an ODE solver. It cannot be done
>>> currently, as some elements required at construction time depend on
>>> the exact type of solver used (tolerance vectors for adaptive
>>> stepsize integrators). So if for example I want to do some
>>> Monte-Carlo analysis in parallel and need to duplicate an
>>> integrator, I would do it as follows:
>>>
>>>   FirstOrderIntegrator[] duplicate(FirstOrderIntegrator integrator,
>>>                                    int n) {
>>>       FirstOrderIntegrator[] copies = new FirstOrderIntegrator[n];
>>>       for (int i = 0; i < n; ++i) {
>>>           copies[i] =
>>>               integrator.withMaxEvaluations(integrator.getMaxEvaluations());
>>>       }
>>>       return copies;
>>>   }
>>>
>>> This kind of API could be extended to several algorithms, so it may
>>> be set up in a consistent way across the library. As I wrote at the
>>> beginning of this message, I am thinking first of root solvers,
>>> optimizers and ODE.
>>>
>>> What do you think?
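To make the copy-on-write behavior concrete, here is a minimal,
self-contained toy (a hypothetical Algorithm class with String
"constraints", not an actual [math] API) showing that deriving a
variant via withXxx never modifies the instance it was derived from:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public final class FluentDemo {

        /** Immutable toy algorithm configured via withXxx methods. */
        static final class Algorithm {
            private final int maxIter;
            private final List<String> constraints;

            Algorithm() {
                this(100, Collections.<String>emptyList());
            }

            private Algorithm(int maxIter, List<String> constraints) {
                this.maxIter     = maxIter;
                this.constraints = constraints;
            }

            Algorithm withMaxIter(int m) {
                return new Algorithm(m, constraints);
            }

            Algorithm withConstraint(String c) {
                // copy-on-write: the current list is never mutated
                List<String> l = new ArrayList<String>(constraints);
                l.add(c);
                return new Algorithm(maxIter, Collections.unmodifiableList(l));
            }

            @Override public String toString() {
                return "maxIter=" + maxIter + ", constraints=" + constraints;
            }
        }

        public static void main(String[] args) {
            Algorithm core   = new Algorithm().withConstraint("x >= 0");
            Algorithm strict = core.withMaxIter(10);
            System.out.println(core);   // maxIter=100, constraints=[x >= 0]
            System.out.println(strict); // maxIter=10, constraints=[x >= 0]
        }
    }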
>> I don't have experience implementing this pattern, so I don't know
>> what best practices / pitfalls there may be. IIRC, what you are
>> getting is formal immutability and more compact code, due to withXxx
>> inline instead of setXxx, at the expense of lots of extra instance
>> creation. I guess the latter is not something to worry much about in
>> today's world. We just have to be careful with the pattern in the
>> cases where constructors actually do something. What is not crystal
>> clear to me is what exactly you get in flexibility beyond what you
>> would get just adding setters ad hoc (as the withXxx stuff
>> effectively does). It is interesting as well to consider the reasons
>> that we favor immutability and whether or not this approach actually
>> improves anything (e.g., concurrency: yes, it helps; path / state
>> complexity: not so much).
>>
>>
> Huh? State complexity is surely much reduced if fields cannot be
> changed after instance construction.
By a naive, syntactic definition, yes. But what is important is the
overall state / path complexity of whatever app is using this stuff.
It is not clear to me that spreading that mutability over lots of
intermediary instances is any better than just mutating a single
instance.

Phil

>
>> Phil
>>
>>> Luc
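P.S. On the concurrency point conceded above ("concurrency: yes, it
helps"), a small sketch of what the immutability buys (hypothetical
Config class, assumed for illustration): a base configuration can be
shared across threads, and each thread derives its own variant without
any synchronization, because no instance ever changes after
construction.

    final class Config {
        final double tolerance;
        Config(double tolerance)       { this.tolerance = tolerance; }
        Config withTolerance(double t) { return new Config(t); }
    }

    class SharingDemo {
        public static void main(String[] args) throws InterruptedException {
            final Config base = new Config(1.0e-6);
            Runnable task = new Runnable() {
                public void run() {
                    // each thread derives a private variant; base is
                    // never mutated, so no locking is needed
                    Config mine = base.withTolerance(1.0e-10);
                    System.out.println(mine.tolerance + " vs " + base.tolerance);
                }
            };
            Thread t1 = new Thread(task);
            Thread t2 = new Thread(task);
            t1.start(); t2.start();
            t1.join();  t2.join();
        }
    }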
