Re: [all] Amazon Corretto
It's a lot harder to do that now without seriously tarnishing your brand. M$ is doing some very good things - VSCode, RxJS ... Successful companies are seeing the light, and the ones that are not are very quickly forgotten about ...

On 11/14/18 12:33 PM, Eric Barnhill wrote:
> It reminds me uncomfortably of Microsoft's old "embrace, extend, exterminate" philosophy in the 1990s.

On Wed, Nov 14, 2018 at 10:03 AM Pascal Schumacher wrote:
> Isn't this basically the same as AdoptOpenJDK: https://adoptopenjdk.net or am I missing something? -Pascal

On 14.11.2018 at 15:14, Rob Tompkins wrote:
> Curious to see what people's thoughts are to this: https://aws.amazon.com/corretto/ -Rob

- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [all] Amazon Corretto
That totally rocks! Ole

On 11/14/18 9:50 AM, Mark Struberg wrote:
> One more option. Which is good for the Java ecosystem. LieGrue, strub

On 14.11.2018 at 15:14, Rob Tompkins wrote:
> Curious to see what people's thoughts are to this: https://aws.amazon.com/corretto/ -Rob
Re: Java 8 curiosity
Are you using streams? A while back I experimented with matrix multiplication using streams, and they can be very slow: https://stackoverflow.com/questions/35037893/java-8-stream-matrix-multiplication-10x-slower-than-for-loop So I would think that Java has a lot of performance tuning and optimization it could do in this space. Cheers, Ole

On 10/09/2017 04:20 PM, Rob Tompkins wrote:
> Hey all, At my day job we saw a 60% performance improvement (CPU utilization) between 1.8.0_40 and 1.8.0_121, and I was wondering if anyone else out there has seen anything like that before, or if anyone might know what could cause that, given that the release notes don't directly point to anything. Cheers, -Rob
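For reference, a minimal sketch of the comparison the linked question is about: the same matrix multiplication written once as a plain nested loop and once with parallel streams. Class and method names and the dimensions are illustrative, not from the thread.

```java
import java.util.stream.IntStream;

public class MatMul {

    // Classic triple loop, ordered i-p-j for cache-friendly row access.
    static double[][] loopMultiply(double[][] a, double[][] b) {
        int n = a.length, m = b[0].length, k = b.length;
        double[][] c = new double[n][m];
        for (int i = 0; i < n; i++)
            for (int p = 0; p < k; p++)
                for (int j = 0; j < m; j++)
                    c[i][j] += a[i][p] * b[p][j];
        return c;
    }

    // Same computation expressed with (parallel) streams; each row of the
    // result is produced independently.
    static double[][] streamMultiply(double[][] a, double[][] b) {
        int m = b[0].length, k = b.length;
        return IntStream.range(0, a.length).parallel()
                .mapToObj(i -> IntStream.range(0, m)
                        .mapToDouble(j -> IntStream.range(0, k)
                                .mapToDouble(p -> a[i][p] * b[p][j]).sum())
                        .toArray())
                .toArray(double[][]::new);
    }
}
```

Both produce identical results; the stream version's per-element lambda dispatch and boxing-adjacent overhead is where the slowdown reported in the Stack Overflow question tends to come from.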
Re: [MATH] what a commons-TLP split could look like
On 06/21/2016 08:07 AM, Jochen Wiedmann wrote:
> On Tue, Jun 21, 2016 at 2:54 PM, Ralph Goers wrote:
>> Maybe. That could depend on whether there is anyone at Commons that would want to participate in the component. Another option is to follow the pattern used by HttpClient. I believe they took the last version of Commons HttpClient and left a pointer to it on their new web site. But what followed as a TLP was completely different and wasn't binary compatible. The community seemed to be completely fine with that.
> Quite right. But I definitely cannot imagine anyone desiring three development lines (Hipparchus, a fork outside Commons, and a component inside).

There are many people that desire that: https://github.com/substack/browserify-handbook#module-philosophy Having one kitchen per chef allows everyone to see what is being cooked up without stepping on toes. Just prepare your dish and show it. If other people like it, then feel free to eat, add a little salt, whatever. What we desire is to see great code get finished, and the best and simplest means / tools for managing that. Having many parallel lines of development means ideas get prototyped and evaluated faster. Here's another fork: https://github.com/firefly-math/firefly-math Cheers, Ole
Re: [crypto][chimera] Next steps
Hi Benedikt, I think it would be better for the project's health if it uses GitHub issues only. Cheers, Ole

On 02/20/2016 05:15 AM, Benedikt Ritter wrote:
> Hi, I'd like to discuss the next steps for moving the Chimera component to Apache Commons. So far, none of the other PMC members has expressed his or her thoughts about this. If nobody brings up objections about moving the component to Apache Commons, I'm assuming lazy consensus about this. So the next steps would be:
> - decide on a name for the new component (my proposal was Apache Commons Crypto)
> - move code to an Apache repo (probably git?!)
> - request a Jira project
> - setup maven build
> - setup project website
> - work on an initial release under Apache Commons coordinates
> Anything missing? Regards, Benedikt
Re: [Math] Maven expert needed...
The project as a whole should probably begin to consider using markdown documents / the GitHub wiki / GitHub Pages as a replacement for the documentation build step. More loosely coupled documentation is easier to contribute to. GitHub markdown documents can be edited directly on GitHub. Most repositories now have a central README.md that links to various supporting documents or subcomponent repositories. Cheers, Ole

On 02/13/2016 09:59 AM, Brent Worden wrote:
> Gilles, Did you ever get this figured out? I will try to find some time to investigate this weekend if you still need assistance. Thanks, Brent
> On Feb 5, 2016 6:24 PM, "Gilles" wrote:
>> ... to fix the "src/userguide" in order to be able to compile the examples. I get compilation errors like:
>> ---CUT---
>> [ERROR] /home/gilles/devel/java/apache/commons-math/trunk/src/userguide/java/org/apache/commons/math4/userguide/FastMathTestPerformance.java:[19,32] cannot find symbol
>>   symbol:   class PerfTestUtils
>>   location: package org.apache.commons.math4
>> [ERROR] /home/gilles/devel/java/apache/commons-math/trunk/src/userguide/java/org/apache/commons/math4/userguide/FastMathTestPerformance.java:[664,54] package PerfTestUtils does not exist
>> ---CUT---
>> It seems related to not finding the "commons-math4-4.0-SNAPSHOT-tools.jar" file, although it has been created:
>> $ ls target/*jar
>> target/commons-math4-4.0-SNAPSHOT-tools.jar target/commons-math4-4.0-SNAPSHOT.jar
>> Also, I think that the naming of the "tools" JAR might be problematic. Isn't the version supposed to come after the complete name (in order to be able to fill the "" tags in the userguide's "pom.xml")? Thanks, Gilles
Re: [Math] Maven expert needed...
On 02/13/2016 11:29 AM, Gilles wrote:
> On Sat, 13 Feb 2016 10:40:41 -0600, Ole Ersoy wrote:
>> The project as a whole should probably begin to consider using markdown documents / the github wiki / github pages as a replacement for the documentation build step. More loosely coupled documentation is easier to contribute to. Github markdown documents can be edited directly on github. Most repositories now have a central README.md that links to various supporting documents or sub component repositories.
> Here, the problem is running Java code (that happens to be stored in the "userguide" part of the repository).

I also think that should be split off as a separate project in a separate repository. For example: https://github.com/spring-projects/spring-data-examples If the long-time maintainers are having issues that require an expert, how are potential future contributors going to feel? Cheers, Ole
Re: [all] apologies
Both Phil and Gilles could have handled this more surgically from the start. A great solution to this is easily within the emotional and technical ability of both of them. To take the view that this is game over is nonsense. Ole

On 02/08/2016 06:45 PM, Niall Pemberton wrote:
> On Mon, Feb 8, 2016 at 8:13 PM, Phil Steitz wrote:
>> I am sorry for the bad tone of my recent posts here. Not the nicest way to leave and I am sorry for that. Phil
> I don't think you have anything to apologize for, and it's disappointing to see you leave. I can understand that you've had enough of the ranting and griping against you, and I think that's a collective failure of the PMC to let it fester for too many years. Niall
Re: [math] ConvergenceChecker
Hi Evan,

On 02/08/2016 08:10 AM, Evan Ward wrote:
> Hi Ole,
> On 02/05/2016 06:40 PM, Ole Ersoy wrote:
>> On 02/05/2016 04:42 PM, Evan Ward wrote:
>>> Yes, I use it. In some cases it is useful to watch the RMS residuals
>> So if it were modularized and supported logging then this might satisfy the same requirement?
> I'm not sure if I understand what you mean by logging.

One of the reasons I'm refactoring and modularizing (making it a standalone jar) the LM optimizer is that I'd like to be able to watch certain steps in action, in the event that issues crop up. So since you mentioned that you were 'watching' the residuals, I assumed it had a similar purpose that aligned well with logging.

[...]

>> Has there ever been a case where the 'standard' convergence approach has been insufficient?
> I think this depends on what is included in the standard convergence checker. I think 90% of uses could be handled by watching the change in cost or state. I like the option of specifying my own condition, so I can control exactly when the algorithm stops.

If it's useful to you then I'm sure it's useful to others as well. Just want to make sure that you definitely need more flexibility than what comes with relaxing or tightening the relative and absolute tolerance parameters. I'm also curious whether the cases that you do need flexibility for could be parameterized in such a way that it makes it simpler to write up the user documentation?

>> Also could you please look at this:
>>
>>     public static LeastSquaresProblem countEvaluations(final LeastSquaresProblem problem,
>>                                                        final Incrementor counter) {
>>         return new LeastSquaresAdapter(problem) {
>>             /** {@inheritDoc} */
>>             @Override
>>             public Evaluation evaluate(final RealVector point) {
>>                 counter.incrementCount();
>>                 return super.evaluate(point);
>>             }
>>             // Delegate the rest.
>>         };
>>     }
>>
>> Should this exist?
> Looks useful for counting evaluations, but I think all of the LS optimizers already do this. Anyone have a use case for countEvaluations?

I think you are right. I think it's code that was accidentally left in... Anyone...? Cheers, Ole
Re: [Math] How fast is fast enough?
The one thing that's really ticking me off is that I realize that Phil's grasp of probability is a lot fresher than my own. Anyways, don't worry about it. I hope we can piece all of this back together, because I have a lot more stuff to annoy some of you with. Cheers, Ole

On 02/06/2016 05:31 AM, James Carman wrote:

Okay, folks, this is definitely getting out of hand. Let's put a moratorium on this thread for the weekend or something and try to come back together next week and try to move forward. I would urge folks to watch this while we wait: https://m.youtube.com/watch?v=rOWmrlft2FI p.s. Phil, I do hope you'll reconsider.

On Fri, Feb 5, 2016 at 10:47 PM Phil Steitz wrote:

OK, I give up. I am withdrawing as volunteer chair or member of the new TLP. Phil

On 2/5/16 7:23 PM, Gilles wrote:

Phil, You talk again about me trying to push forward changes that serve no purpose besides "trash performance and correctness". This is again baseless FUD to which I've already answered (with a detailed list of facts which you chose to ignore). You declare anything for which you don't have an answer as a "bogus argument". Why is the reference to multi-threaded implementations bogus? You contradict yourself in pretending that CM RNGs could be so good as to make people want to use them while refusing to consider whether another design might be better suited to such high(er)-performance extensions. This particular case is a long shot, but if any and all discussions are stopped dead, how do you imagine that we can go anywhere? As you could read from experts, micro-benchmarks are deceiving; but you refuse to even consider alternative designs if there might be a slight suspicion of degradation. How can we ever set up a constructive discussion on how to make everybody enjoy this project if the purported chair is so bent on protecting existing code rather than nurturing a good relationship with developers who may sometimes have other ideas?
I'm trying to improve the code (in a dimension which you can't seem to understand unfortunately) but respectfully request data points from those users of said code, in order to be able to prove that no harm will be done. But you seem to prefer to not disclose anything that would get us closer to agreement (better design with similar performance and room for improvement, to be discussed together as a real development team -- Not you requiring, as a bad boss, that I bow to your standards for judging usefulness). This 1% which you throw at me, where does it come from? What does 1% mean when the benchmark shows standard deviations that vary from 4 to 26% in the "nextInt" case and from 3 to 7% in the "nextGaussian" case? This 1% looks meaningless without context; context is what I'm asking in order to try and establish objectively whether another design will have a measurable impact on actual tasks. I'm not going to show any "damaged" benchmark because of how unwelcome you make me feel every time I wish to talk about other aspects of the code. There is no development community here. Only solitary coders who share a repository. Not sorry for the top-post, Gilles On Fri, 5 Feb 2016 17:07:16 -0700, Phil Steitz wrote: On 2/5/16 12:59 PM, Gilles wrote: On Fri, 5 Feb 2016 06:50:10 -0700, Phil Steitz wrote: On 2/4/16 3:59 PM, Gilles wrote: Hi. 
Here is a micro-benchmark report (performed with "PerfTestUtils"):

- nextInt() (calls per timed block: 200, timed blocks: 100, time unit: ms)

    name                          time/call  std dev  total time  ratio  cv    difference
    o.a.c.m.r.JDKRandomGenerator  1.088e-05  2.8e-06  2.1761e+03  1.000  0.26   0.e+00
    o.a.c.m.r.MersenneTwister     1.024e-05  1.5e-06  2.0471e+03  0.941  0.15  -1.2900e+02
    o.a.c.m.r.Well512a            1.193e-05  4.4e-07  2.3864e+03  1.097  0.04   2.1032e+02
    o.a.c.m.r.Well1024a           1.348e-05  1.9e-06  2.6955e+03  1.239  0.14   5.1945e+02
    o.a.c.m.r.Well19937a          1.495e-05  2.1e-06  2.9906e+03  1.374  0.14   8.1451e+02
    o.a.c.m.r.Well19937c          1.577e-05  8.8e-07  3.1542e+03  1.450  0.06   9.7816e+02
    o.a.c.m.r.Well44497a          1.918e-05  1.4e-06  3.8363e+03  1.763  0.08   1.6602e+03
    o.a.c.m.r.Well44497b          1.953e-05  2.8e-06  3.9062e+03  1.795  0.14   1.7301e+03
    o.a.c.m.r.ISAACRandom         1.169e-05  1.9e-06  2.3375e+03  1.074  0.16   1.6139e+02

- where "cv" is the ratio of the 3rd to the 2nd column.

Questions are:
* How meaningful are micro-benchmarks when the timed operation has a very small duration (wrt e.g. the duration of other machine instructions that are required to perform them)?

It is harder to get good benchmarks for shorter duration activities, but not impossible. One thing that it would be good to do is to compare these results with JMH [1].

I was expecting insights based on the benchmark which I did run.

You asked whether or not benchmarks are meaningful when the task being benchmarked is short duration. I answered that question. We have a tool in CM; if it's wrong, we should remove it. How
[math] ConvergenceChecker
Hi, The LeastSquaresProblem supports dropping in a custom ConvergenceChecker. Does anyone do this? Can you help me better understand the value of it? TIA, Ole
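For context, a minimal self-contained sketch of the kind of custom rule being asked about. The `Evaluation` and `ConvergenceChecker` stand-in types below are simplified placeholders, not the real Commons Math interfaces; only the `converged(iteration, previous, current)` shape mirrors the library's contract.

```java
public class CustomChecker {

    // Simplified stand-ins so the sketch compiles on its own.
    interface Evaluation { double rms(); }
    interface ConvergenceChecker<T> {
        boolean converged(int iteration, T previous, T current);
    }

    // Stop when the change in RMS residual is small relative to its
    // magnitude, or when a hard iteration cap is reached -- the kind of
    // condition a user might drop into a LeastSquaresProblem.
    static ConvergenceChecker<Evaluation> rmsChecker(double relTol, int maxIter) {
        return (iteration, previous, current) -> {
            if (iteration >= maxIter) {
                return true;
            }
            double delta = Math.abs(previous.rms() - current.rms());
            return delta <= relTol * Math.max(previous.rms(), current.rms());
        };
    }
}
```

The value of accepting a checker object rather than fixed tolerances is exactly this: the stopping rule becomes data the caller supplies, so RMS-based, state-based, or iteration-capped rules all plug into the same optimizer loop.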
Re: [math] LevenbergMarquardt Evaluation Lazy vs. Unlazy
On 02/05/2016 04:52 AM, Gilles wrote:
> On Thu, 4 Feb 2016 18:56:20 -0600, Ole Ersoy wrote:
>> On 02/04/2016 04:13 PM, Gilles wrote:
>>> On Thu, 4 Feb 2016 14:10:39 -0600, Ole Ersoy wrote:
>>>> Hi, Has anyone performed any benchmarking on lazy vs. unlazy Evaluation(s)
>>> Someone did: https://issues.apache.org/jira/browse/MATH-1128
>>>> or is there some obvious criteria on when to use one vs. the other? I only see getResiduals() being called once in the optimize() method right after a new evaluation is created:
>>>>
>>>>     current = problem.evaluate(new ArrayRealVector(currentPoint));
>>>>     currentResiduals = current.getResiduals().toArray();
>>>>
>>>> Thoughts?
>>> The problem is "getJacobian()", called only in the outer loop. Method "evaluate" is also called in an inner loop where only the residuals are used.
>> So if the optimizer is supplied with individual function implementations that are called to calculate residuals and the Jacobian matrix 'on demand / when needed', then the question of whether to use a lazy evaluation vs. the regular evaluation goes away (I think without any drawbacks)?
> The two functions were separate in the previous design, then grouped in the current one because it was reported that it is often the case that both are computed at the same time.

I'm still in the process of scanning through, but I think it's better if all the optimizer's parameters and functions are grouped on a single OptimizationContext instance that then provides or calculates values on demand. So if we need residuals, we ask for them:

    double[] residuals = context.residuals(point);
    double[][] jacobian = context.jacobian(point);

> So the grouping was deemed a simplification.

Spent a while scratching my noodle when I saw it ...

> It is, but for use-cases where they are not computed together and the objective function is costly, performance can suffer badly (i.e. not just milliseconds...).

My opinion is that this (providing multiple implementations of aggregated operations) makes use cases more complex, both from the client API user's perspective and the core developer's perspective. I have not looked across the board at all use cases yet (developer perspective), but I believe having a single context that provides values on demand will be simpler in all cases. Cheers, Ole
Re: [math] ConvergenceChecker
On 02/05/2016 04:42 PM, Evan Ward wrote:
> Yes, I use it. In some cases it is useful to watch the RMS residuals

So if it were modularized and supported logging then this might satisfy the same requirement?

> , in other cases to watch the change in the states. I think it is there from an acknowledgement that we can't enumerate all possible convergence criteria.

Has there ever been a case where the 'standard' convergence approach has been insufficient? Also could you please look at this:

    public static LeastSquaresProblem countEvaluations(final LeastSquaresProblem problem,
                                                       final Incrementor counter) {
        return new LeastSquaresAdapter(problem) {
            /** {@inheritDoc} */
            @Override
            public Evaluation evaluate(final RealVector point) {
                counter.incrementCount();
                return super.evaluate(point);
            }
            // Delegate the rest.
        };
    }

Should this exist? Thanks, Ole
[math] LevenbergMarquardt ParameterValidator
Hi, I'm attempting to understand when and how I would want to use the ParameterValidator WRT LevenbergMarquardt problem construction. Anyone have any stories they can share? TIA, Ole
Re: [math] LevenbergMarquardt ParameterValidator
On 02/04/2016 04:30 PM, Gilles wrote:
> On Thu, 4 Feb 2016 15:08:30 -0600, Ole Ersoy wrote:
>> Hi, I'm attempting to understand when and how I would want to use the ParameterValidator WRT LevenbergMarquardt problem construction. Anyone have any stories they can share?
> It is possible that the optimizer wants to try a certain value for an optimized parameter that entails a constraint on another: the validator can attempt to reconcile them. It might be better to return a different point rather than fail.

So how do I recognize this scenario and put in place a ParameterValidator that is appropriate? What is the event that causes the validate method to be triggered? Ole
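To make the "return a different point rather than fail" idea concrete, here is a minimal sketch of a bounds-clamping rule, which is one common thing a parameter validator does: each time the optimizer proposes a point, the validator maps it back into the feasible box before evaluation. The class, method name, and bounds are illustrative placeholders, not the Commons Math API.

```java
public class BoundsValidator {

    // Clamp each proposed parameter into [lo[i], hi[i]]. A real validator
    // would be invoked on every candidate point the optimizer produces,
    // before the model function is evaluated at that point.
    static double[] clamp(double[] point, double[] lo, double[] hi) {
        double[] out = point.clone();
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.max(lo[i], Math.min(hi[i], out[i]));
        }
        return out;
    }
}
```

So the triggering "event" in this sketch is simply: a new candidate point exists but has not yet been evaluated; the validator gets the last word on what point is actually used.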
Re: [math] LevenbergMarquardt Evaluation Lazy vs. Unlazy
On 02/04/2016 04:13 PM, Gilles wrote:
> On Thu, 4 Feb 2016 14:10:39 -0600, Ole Ersoy wrote:
>> Hi, Has anyone performed any benchmarking on lazy vs. unlazy Evaluation(s)
> Someone did: https://issues.apache.org/jira/browse/MATH-1128
>> or is there some obvious criteria on when to use one vs. the other? I only see getResiduals() being called once in the optimize() method right after a new evaluation is created:
>>
>>     current = problem.evaluate(new ArrayRealVector(currentPoint));
>>     currentResiduals = current.getResiduals().toArray();
>>
>> Thoughts?
> The problem is "getJacobian()", called only in the outer loop. Method "evaluate" is also called in an inner loop where only the residuals are used.

So if the optimizer is supplied with individual function implementations that are called to calculate residuals and the Jacobian matrix 'on demand / when needed', then the question of whether to use a lazy evaluation vs. the regular evaluation goes away (I think without any drawbacks)? Ole
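The mechanism behind a lazy Evaluation can be sketched with a memoizing supplier: the expensive quantity (here standing in for the Jacobian) is computed at most once per evaluation point, and only if something actually asks for it. This is an illustrative stand-alone sketch; the names are ours, not the Commons Math implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LazyEval {

    // Counter so the behavior can be observed: how many times the
    // "expensive" computation actually ran.
    static final AtomicInteger jacobianCalls = new AtomicInteger();

    // Wrap an expensive computation so it runs at most once, on first
    // access. An inner loop that only reads residuals never triggers it.
    static <T> Supplier<T> lazy(Supplier<T> expensive) {
        return new Supplier<T>() {
            private T value;
            private boolean done;
            public T get() {
                if (!done) {
                    value = expensive.get();
                    done = true;
                }
                return value;
            }
        };
    }
}
```

With separate on-demand suppliers for residuals and the Jacobian, the lazy/unlazy distinction collapses: each quantity is paid for exactly when (and only if) it is requested, which is the point made above about "getJacobian()" living only in the outer loop.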
[math] LeastSquaresProblem countEvaluations?
Hi, The LM optimize method counts evaluations. The method below does too. Just wanted to check in to see whether it's still supposed to?

    public static LeastSquaresProblem countEvaluations(final LeastSquaresProblem problem,
                                                       final Incrementor counter) {
        return new LeastSquaresAdapter(problem) {
            /** {@inheritDoc} */
            @Override
            public Evaluation evaluate(final RealVector point) {
                counter.incrementCount();
                return super.evaluate(point);
            }
            // Delegate the rest.
        };
    }

Cheers, Ole
[math] LevenbergMarquardt Evaluation Lazy vs. Unlazy
Hi, Has anyone performed any benchmarking on lazy vs. unlazy Evaluation(s), or is there some obvious criteria on when to use one vs. the other? I only see getResiduals() being called once in the optimize() method right after a new evaluation is created:

    current = problem.evaluate(new ArrayRealVector(currentPoint));
    currentResiduals = current.getResiduals().toArray();

Thoughts? TIA, Ole
Re: [math] [POLL] new TLP name
Apache Epsilon

On 02/01/2016 03:55 PM, Bruce Johnson wrote:
> Apache Epsilon
> On Feb 1, 2016, at 12:06 PM, Phil Steitz wrote:
>> Please select your top choice among the following suggested names for the new [math]-based TLP. All are welcome and encouraged to respond. This POLL will be open for 72 hours, at which time two tallies will be presented: one among those who have volunteered for the new PMC and a second among all respondents. Hopefully, one name will emerge as consensus winner. If not, I will kick off another runoff poll among the top choices. Please respond with your top choice for the name.
>>
>> AjaMa
>> Epsilon
>> Erdos
>> Euclid
>> Euler
>> Gauss
>> JAML
>> Math
>> MathBlocks
>> MathComponents (or Math Components)
>> Mathelactica
>> MathModules
>> Megginson
>> modMath
>> Nash
>> Newton
>> Pythagoras
Re: [math] Name of the new TLP
I like any name that is simple (it's also good if it has a nice ring to it). If we are hoping to incorporate more modules from other projects then perhaps 'apache-davinci'? I like 'apache-epsilon' as well.

On 01/25/2016 05:40 AM, sebb wrote:
> On 25 January 2016 at 09:28, luc <l...@spaceroots.org> wrote:
>> On 2016-01-25 08:52, Benedikt Ritter wrote:
>>> Hi, I very much like the idea of taking the name of a famous mathematician.
> In which case it has to be Euclid or Pythagoras (early) or Paul Erdős - https://en.wikipedia.org/wiki/Erd%C5%91s_number and everyone has heard of John Nash (Beautiful Mind) etc.
>> In the spirit of recent discussions, how about a RNG to pick the mathematician's name for each next incarnation? ;-)
>>> If it has to be something more descriptive: Apache Commons HttpClient went to Apache HttpComponents. How about Apache Math Components as TLP name?
> I quite like Apache Epsilon as a non-descriptive but related name. [ducks behind bikeshed]
>>> Benedikt
>>> 2016-01-25 8:40 GMT+01:00 Ole Ersoy <ole.er...@gmail.com>:
>>>> Umbrella-ish is good. Linear algebra, genetic algorithms, neural networks, clustering, monte carlo, simplex... These need an umbrella. Some of the other Apache projects that do math may be interested in moving that piece under the Apache Math umbrella. Personally I like to see each in a separate repository dedicated to the subject, along with the corresponding documentation, etc. So:
>>>> apache-math (central repository describing the project as a whole with the documentation that cuts across modules)
>>>> apache-math-linear-real
>>>> apache-math-linear-field
>>>> apache-math-optimization-genetic
>>>> apache-math-optimization-simplex
>>>> etc. And hopefully:
>>>> apache-math-optimization-integer
>>>> apache-math-optimization-mixed
>>>> And more.. Cheers, Ole
>>>> On 01/24/2016 04:41 PM, Phil Steitz wrote:
>>>>> On Jan 24, 2016, at 3:17 PM, Gilles <gil...@harfang.homelinux.org> wrote:
>>>>>> Just plain and simple "Apache Math" maybe? Or is it taken already?
It's not taken; but I thought it was too broad-sounding and in fact umbrella-ish. There are other ASF projects that do math-related things. I think adding "components" makes it look more like a library of base components that other math-related projects can use. Phil

Gilles

On Sun, 24 Jan 2016 14:46:17 -0700, Phil Steitz wrote:
> On 1/24/16 2:16 PM, James Carman wrote:
>> I guess it depends on the scope of what the new TLP is going to do.
> This is slightly jumping the gun, as we do have the opportunity in forming the new TLP to revisit the initial goals of [math]; but I suspect that initially at least we will mostly continue to be a general-purpose Java math library, trying to provide IP-clean, easily integrated, standard algorithm-based solutions to common math programming problems. We have grown to the point where we will almost certainly break the distribution up into separate "components." No umbrella, but likely multiple release artifacts. Similar in some ways to what happened with [http], which is why I suggested the same approach to naming. Regarding picking a mathematician for the name, I don't much like that idea as whoever you choose, you end up loading some math area and / or cultural bias into the name. Phil
>> Umbrella projects aren't that popular these days, from what I understand. Maybe an homage to a famous mathematician? Apache Newton? Apache Euler? Apache Euclid?
>> On Sun, Jan 24, 2016 at 4:08 PM Phil Steitz <phil.ste...@gmail.com> wrote:
>>> We need to agree on a name. My own preference is for a boring, descriptive name, but I am manifestly not a marketing guy, so won't be offended if others want to be more creative. My suggestion is MathComponents

I would be happy with either Math Components or Math. Also, though I do favor fancy acronyms that read exactly as well-known names (23 years ago, the name of my first mathematics library was an acronym that read "Cantor"), it is probably not a good idea for this new TLP.
In any case, the project will most probably be a de facto umbrella project, as modularizing it seems to be the current mood. best regards, Luc

> Hearkens back to HttpComponents, which has worked pretty well. Phil
Re: [math] Name of the new TLP
The general pattern that I'm seeing within the NodeJS community is that big projects are being broken up into smaller projects and those smaller projects are being broken up yet again, etc. What I find refreshing about that is that it's pretty simple to find a component / module that I need for which I can read the example and get up to speed in 5 minutes and if needed read the source code in less than an hour. In contrast I'm spending a fairly significant amount of time attempting to break up CM into smaller, simpler, more easily digestible pieces. Once something becomes a big library it's very hard to manage it elegantly. Small projects are easy to manage. For example I'm combining Twitter Bootstrap and SUIT-CSS into a framework called superfly-css. If you search for it on npmjs all the modules pop up (A combination of tools and css components): https://www.npmjs.com/search?q=superfly-css Each module links back to the parent module for usage documentation, install, design guidelines, etc. The superfly-css organization landing page is going to outline all the tool and component modules. Each module can have a gh-pages branch containing the site for the module if necessary. I have a very easy time managing this and anyone who uses it gets precisely what they need. It's like ordering Sushi :) and the process for using the modules is almost that simple. This is the best argument I have seen for taking a modular approach: https://github.com/substack/browserify-handbook#module-philosophy As an example directly related to CM have a look at the now fairly complete firefly-math-exceptions module: https://github.com/firefly-math/firefly-math-exceptions Anyone on this list can probably read the source in about 10 minutes. It's very easy to integrate / extend this into any (CM or non CM) math project. So if we create an organization structure targeting small simple modules I think we will have a lot more fun with this. 
Cheers, Ole On 01/25/2016 07:47 AM, Phil Steitz wrote: On 1/25/16 12:40 AM, Ole Ersoy wrote: Umbrella-ish is good. Linear algebra, genetic algorithms, neural networks, clustering, monte carlo, simplex...These need an umbrella. Some of the other Apache projects that do math may be interested in moving that piece under the Apache Math umbrella. The ASF does not look favorably on "umbrella" projects. This is because in these projects, the individual volunteers making up the PMC inevitably lose sight of the full project. The governance model that we have at Apache has no layers in it beneath the PMC. That means PMCs need to be focused. "All things X" PMCs don't work. The canonical example of that was Jakarta, which started as "all things Java" and was eventually split up. We should definitely not try to be "all things math" at the ASF. A better focus would be a nice set math components in Java that other math-related projects inside and outside the ASF can use. Kind of like, um, Commons Math as its own TLP :) Phil Personally I like to see each in a separate repository dedicated to the subject, along with the corresponding documentation, etc So: apache-math (Central repository describing the project as a whole with the documentation that cuts across modules) apache-math-linear-real apache-math-linear-field apache-math-optimization-genetic apache-math-optimization-simplex etc. And hopefully: apache-math-optimization-integer apache-math-optimization-mixed And more.. Cheers, Ole On 01/24/2016 04:41 PM, Phil Steitz wrote: On Jan 24, 2016, at 3:17 PM, Gilles <gil...@harfang.homelinux.org> wrote: Just plain and simple "Apache Math" maybe? Or is it taken already? It's not taken; but I thought it was too broad-sounding and in fact umbrella-ish. There are other ASF projects that do math-relates things. I think adding "components" makes it look more like a library of base components that other math-related projects can use. 
Phil Gilles On Sun, 24 Jan 2016 14:46:17 -0700, Phil Steitz wrote: On 1/24/16 2:16 PM, James Carman wrote: I guess it depends on the scope of what the new TLP is going to do. This is slightly jumping the gun, as we do have the opportunity in forming the new TLP to revisit the initial goals of [math]; but I suspect that initially at least we will mostly continue to be a general-purpose Java math library, trying to provide IP-clean, easily integrated, standard algorithm-based solutions to common math programming problems. We have grown to the point where we will almost certainly break the distribution up into separate "components." No umbrella, but likely multiple release artifacts. Similar in some ways to what happened with [http], which is why I suggested the same approach to naming. Regarding picking a mathematician for the name, I don't much like that idea as whoever you choose, you end up loading some math area and / or cultural bias into the name. Phil
Re: [math] Name of the new TLP
Also if each module is very simple and isolated, alphas, betas, etc. matter less (if at all). Most devs releasing to npm rely on semver only. Cheers, Ole On 01/25/2016 02:27 PM, Gary Gregory wrote: On Jan 25, 2016 10:11 AM, "Emmanuel Bourg" wrote: On 25/01/2016 18:52, Gilles wrote: AFAICT, the real issue is one of policy: Commons is supposed to be stable, stable, stable and stable (IIUC). And CM is far from being mature as a programming project, when considering design and scope, and not only the quality of its results and performance (which are both good in many cases). So stability (as in using JDK 5 only) is not a good perspective (surely not for developers and probably not for users either IMO). If this does not change, what's the point indeed? I hope that a motivation behind the TLP isn't to break the compatibility on every release, otherwise this will quickly turn into a nightmare for the users. Bouncycastle plays this game and it isn't really fun to follow :( WRT compatibility, the only thing that matters is not creating jar hell for users. You can break compatibility if you change package and maven coordinates. It's up to the project to create enough alphas and betas to get to a stable public API before a release. That's just basic project management IMO. Anything less will leave a lot of users unhappy. Gary Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [math] Name of the new TLP
It's very easy to create one, but I think we should focus on small, high quality, simple to use modules and let third parties provide assemblies. I think most of us will feel better about providing solutions that explicitly declare the modules used. This gives maintainers a more precise target. Keeping an uber jar also gets more and more difficult to maintain with each new module release. For example with superfly-css I change modules all the time and I'm planning on adding a lot of new ones. If I also add an uber module I have to maintain that as well. That's not my main concern though. I think an uber jar / module can easily cause headaches. It's the opposite of letting JDK 9 or OSGi manage dependencies and corresponding contexts. The less indirection the better. Cheers, Ole On 01/25/2016 06:38 PM, Gary Gregory wrote: If you decide to break up math into modules, I encourage you to also provide an all-in-one jar. Gary On Jan 25, 2016 4:22 PM, "Ole Ersoy" <ole.er...@gmail.com> wrote: Also if each module is very simple and isolated, alphas, betas, etc. matter less (if at all). Most devs releasing to npm rely on semver only. Cheers, Ole On 01/25/2016 02:27 PM, Gary Gregory wrote: On Jan 25, 2016 10:11 AM, "Emmanuel Bourg" <ebo...@apache.org> wrote: On 25/01/2016 18:52, Gilles wrote: AFAICT, the real issue is one of policy: Commons is supposed to be stable, stable, stable and stable (IIUC). And CM is far from being mature as a programming project, when considering design and scope, and not only the quality of its results and performance (which are both good in many cases). So stability (as in using JDK 5 only) is not a good perspective (surely not for developers and probably not for users either IMO). If this does not change, what's the point indeed? I hope that a motivation behind the TLP isn't to break the compatibility on every release, otherwise this will quickly turn into a nightmare for the users. 
Bouncycastle plays this game and it isn't really fun to follow :( WRT compatibility, the only thing that matters is not creating jar hell for users. You can break compatibility if you change package and maven coordinates. It's up to the project to create enough alphas and betas to get to a stable public API before a release. That's just basic project management IMO. Anything less will leave a lot of users unhappy. Gary Emmanuel Bourg
Re: [math] Name of the new TLP
I like Apache Math as well. Cheers, Ole On 01/24/2016 04:18 PM, James Carman wrote: I'm okay with that too. Apache Math On Sun, Jan 24, 2016 at 5:17 PM Gilles wrote: Just plain and simple "Apache Math" maybe? Or is it taken already? Gilles On Sun, 24 Jan 2016 14:46:17 -0700, Phil Steitz wrote: On 1/24/16 2:16 PM, James Carman wrote: I guess it depends on the scope of what the new TLP is going to do. This is slightly jumping the gun, as we do have the opportunity in forming the new TLP to revisit the initial goals of [math]; but I suspect that initially at least we will mostly continue to be a general-purpose Java math library, trying to provide IP-clean, easily integrated, standard algorithm-based solutions to common math programming problems. We have grown to the point where we will almost certainly break the distribution up into separate "components." No umbrella, but likely multiple release artifacts. Similar in some ways to what happened with [http], which is why I suggested the same approach to naming. Regarding picking a mathematician for the name, I don't much like that idea as whoever you choose, you end up loading some math area and / or cultural bias into the name. Phil Umbrella projects aren't that popular these days, from what I understand. Maybe an homage to a famous mathematician? Apache Newton? Apache Euler? Apache Euclid? On Sun, Jan 24, 2016 at 4:08 PM Phil Steitz wrote: We need to agree on a name. My own preference is for a boring, descriptive name, but I am manifestly not a marketing guy, so won't be offended if others want to be more creative. My suggestion is MathComponents Hearkens back to HttpComponents, which has worked pretty well. Phil
Re: [math] RealMatrix power requires square matrix?
Never mind on the below - I was thinking scalar operations ... Cheers, Ole On 01/20/2016 04:53 PM, Ole Ersoy wrote: Hi, The RealMatrix.power(p) (http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math3/linear/RealMatrix.html#power%28int%29) says that it will throw a NON_SQUARE_MATRIX_EXCEPTION. Is this correct? Cheers, Ole
[math] RealMatrix power requires square matrix?
Hi, The RealMatrix.power(p) (http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math3/linear/RealMatrix.html#power%28int%29) says that it will throw a NON_SQUARE_MATRIX_EXCEPTION. Is this correct? Cheers, Ole
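For context on why that exception makes sense: power(p) multiplies the matrix by itself, and an (m x n) by (m x n) product is only defined when n == m, so only square matrices have integer powers. A minimal sketch with plain arrays (illustrative code, not the Commons Math implementation):

```java
class MatrixPower {
    // Raise a square matrix to a non-negative integer power by repeated
    // multiplication. A non-square input is rejected up front, because
    // (m x n) . (m x n) only type-checks when n == m.
    static double[][] power(double[][] a, int p) {
        int n = a.length;
        for (double[] row : a) {
            if (row.length != n) {
                throw new IllegalArgumentException("power() requires a square matrix");
            }
        }
        // Start from the identity matrix, since a^0 == I.
        double[][] result = new double[n][n];
        for (int i = 0; i < n; i++) {
            result[i][i] = 1.0;
        }
        for (int k = 0; k < p; k++) {
            result = multiply(result, a);
        }
        return result;
    }

    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0;
                for (int k = 0; k < n; k++) {
                    sum += a[i][k] * b[k][j];
                }
                c[i][j] = sum;
            }
        }
        return c;
    }
}
```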
Re: [math] TLP
... and it looks like watch notifications for these are now enabled. Issues are still going through JIRA though. Cheers, Ole On 01/13/2016 08:16 PM, Gary Gregory wrote: Commons projects that use Git like Math and Lang are already mirrored on GitHub, see: https://github.com/apache/commons-math https://github.com/apache/commons-lang Gary On Wed, Jan 13, 2016 at 6:04 PM, Ole Ersoy <ole.er...@gmail.com> wrote: I love the idea. I also think Commons will get a lot more eyeballs if it gets all the repositories on GitHub and enables the watch button as well as GitHub issues. Cheers, Ole On 01/13/2016 07:24 PM, Gary Gregory wrote: I like having [math] in Commons. There are other multi-module projects in Commons, that's not an issue IMO, just good project design. My main worry is more on the overall health of Commons or perception that [math] is "leaving" Commons, the more eyeballs on Commons the better. Gary On Wed, Jan 13, 2016 at 4:50 PM, Phil Steitz <phil.ste...@gmail.com> wrote: I would like to propose that we split [math] out into a top level project at the ASF. This has been proposed before, and I have always come down on the side of staying in Commons, but I am now convinced that it is a good step for us to take for the following reasons: 0) We have several committers who are really only interested in [math], so being on the Commons PMC does not really make sense for them 1) The code base has swollen in size to well beyond the "small sharp tools" that make up the bulk of Commons 2) We are probably at the point where we should consider splitting [math] itself into separately released subcomponents (could be done in Commons, but starts smelling a little Jakarta-ish when Commons has components with subcomponents). 
The downsides are a) [newPMC] loses Commons eyeballs / contributors who would not find us otherwise b) Migration / repackaging pain c) Overhead of starting and managing a PMC d) Other Commons components lose some eyeballs Personally, I think the benefits outweigh the downsides at this point. New better tools and ASF processes have made b) and c) a little less onerous. I don't think d) is really a big problem for Commons, as those of us who work on other stuff here could continue to do so. It is possible that a) actually works in the reverse direction - i.e., we are easier to find as a TLP. What do others think about this? Phil
Re: [math] TLP
I love the idea. I also think Commons will get a lot more eyeballs if it gets all the repositories on GitHub and enables the watch button as well as GitHub issues. Cheers, Ole On 01/13/2016 07:24 PM, Gary Gregory wrote: I like having [math] in Commons. There are other multi-module projects in Commons, that's not an issue IMO, just good project design. My main worry is more on the overall health of Commons or perception that [math] is "leaving" Commons, the more eyeballs on Commons the better. Gary On Wed, Jan 13, 2016 at 4:50 PM, Phil Steitz wrote: I would like to propose that we split [math] out into a top level project at the ASF. This has been proposed before, and I have always come down on the side of staying in Commons, but I am now convinced that it is a good step for us to take for the following reasons: 0) We have several committers who are really only interested in [math], so being on the Commons PMC does not really make sense for them 1) The code base has swollen in size to well beyond the "small sharp tools" that make up the bulk of Commons 2) We are probably at the point where we should consider splitting [math] itself into separately released subcomponents (could be done in Commons, but starts smelling a little Jakarta-ish when Commons has components with subcomponents). The downsides are a) [newPMC] loses Commons eyeballs / contributors who would not find us otherwise b) Migration / repackaging pain c) Overhead of starting and managing a PMC d) Other Commons components lose some eyeballs Personally, I think the benefits outweigh the downsides at this point. New better tools and ASF processes have made b) and c) a little less onerous. I don't think d) is really a big problem for Commons, as those of us who work on other stuff here could continue to do so. It is possible that a) actually works in the reverse direction - i.e., we are easier to find as a TLP. What do others think about this? 
Phil
Re: [math] Should this throw a NO_DATA exception?
Hi, On 01/09/2016 06:21 AM, Gilles wrote: [...] But we should know the target of the improvement. I mean, is it a drop-in replacement of the current "RealVector"? OK - I think it's probably confusing because I posted JDK8 examples earlier. I'm just wondering whether the current RealVector norm methods should throw a no data exception? I think they should. If so, how can it happen before we agree that Java 8 JDK can be used in the next major release of CM? At some point I'm sure CM will switch over, so we can start experimenting with features now. If it's a redesign, maybe we should define a "wish list" of what properties should belong to which concept. I think that is a good inclusive approach for a community. My primary wishes are:
- Remove inheritance when possible in order to keep it simple (possibly at the expense of generic use)
- Design classes that are focused on doing small simple things
- Modularize (I could list all the benefits, but I think we know them). The longer CM takes to do this the harder it will be. Every single time someone sprinkles it with FastMath it gets a little harder...
So in general just keep it simple. If it needs to support other requirements then: Reuse operations from a FunctionXXX class. Support new forms of state in a new module. E.g. for a "matrix" it might be useful to be mutable (as per previous discussions on the subject), I think the approach here should be very strict. For example ArrayRealVector has almost half the code dedicated to mutation that can easily be done elsewhere. I think this was done because CM is not modular. We can't defer to an array module for the manipulations so they had to be baked into ArrayRealVector. but for a (geometrical) vector it might be interesting to not be (as in the case for "Vector3D" in the "geometry" package). The "matrix" concept probably requires a more advanced interface in order to allow efficient implementations of basic operations like multiplication... 
Yes - For example when multiplying a sparse matrix times a sparse vector? Or a normal vector times a sparse matrix? Etc. I'm hoping there's a very simple way to accomplish this outside of using inheritance. There is an issue on the bug-tracking system that started to collect many of the various problems (specific and general) of data containers ("RealVector", "RealMatrix", etc.) of the "o.a.c.m.linear" package. Perhaps it would be more useful, for future reference, to list everything in one place. Sure - I think in this case though we can knock it out fast. Sometimes when we list everything in one place people look at it, get a headache, and start drinking :). To me it seems like a vector that is empty (no elements) is different from having a vector with 1 or more 0d entries. In the latter case, according to the formula, the norm is zero, but in the first case, is it? To be on the safe side, it should be an error, but I've just had to let this kind of condition pass (cf. MATH-1300 and related on the implementation of the "nextBytes(byte[],int,int)" feature). On Fri, 8 Jan 2016 18:41:27 -0600, Ole Ersoy wrote: public double getLInfNorm() { double norm = 0; Iterator it = iterator(); while (it.hasNext()) { final Entry e = it.next(); norm = FastMath.max(norm, FastMath.abs(e.getValue())); } return norm; } The main problem with the above is that it assumes that the elements of a "RealVector" are Cartesian coordinates. There is no provision that it must be the case, and assuming it is would contradict other methods like "append". While experimenting with the design of the current implementation I ended up throwing the exception. I think it's the right thing to do. The net effect is that if someone creates a new ArrayVector(new double[]{}), then the exception is thrown, so if they don't want it thrown then they should new ArrayVector(new double[]{0}). More explanations of this design below ... I don't know at this point (not knowing the intended usage). 
One way to look at it is to say "Conceptually it is not correct, but we are using it in a way that eliminates this flaw, so it's OK". Which I don't think is OK, unless we can say conclusively and globally that it's OK for all users in all cases. In this case I think returning a zero norm when there is no data is wrong, and can potentially lead to wrong results. [I think this is low-level discussion that is not impacting the design but would fix an API at too early a stage.] Yes I see your point there. Why patch the roof if the house is getting demolished in two weeks. CM seems to be really nice to all the interested parties with respect to this though. Ubuntu provides long term supported releases. Fedora releases every six months and disconti
[math] Should this throw a NO_DATA exception?
public double getLInfNorm() {
    double norm = 0;
    Iterator it = iterator();
    while (it.hasNext()) {
        final Entry e = it.next();
        norm = FastMath.max(norm, FastMath.abs(e.getValue()));
    }
    return norm;
}

Cheers, Ole
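One way to see the question: the loop above starts with norm = 0 and never executes for an empty vector, so it silently returns 0.0. A stream-based sketch makes the missing-data case explicit, since max() over an empty stream yields an empty OptionalDouble (the class and exception message here are illustrative, not CM API):

```java
import java.util.OptionalDouble;
import java.util.stream.DoubleStream;

class LInfNorm {
    // The iterator version silently returns 0.0 for an empty vector.
    // With streams the missing-data case is explicit: max() over an
    // empty stream is an empty OptionalDouble, which we turn into an error.
    static double lInfNorm(double[] data) {
        OptionalDouble max = DoubleStream.of(data).map(Math::abs).max();
        return max.orElseThrow(() ->
                new IllegalArgumentException("NO_DATA: vector has no elements"));
    }
}
```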
Re: [math] Should this throw a NO_DATA exception?
Hi Gilles, On 01/08/2016 07:37 PM, Gilles wrote: Hi Ole. Maybe I don't understand the point of the question but I'm not sure that collecting here answers to implementation-level questions is going to lead us anywhere. Well it could lead to an improved implementation :). There is an issue on the bug-tracking system that started to collect many of the various problems (specific and general) of data containers ("RealVector", "RealMatrix", etc.) of the "o.a.c.m.linear" package. Perhaps it would be more useful, for future reference, to list everything in one place. Sure - I think in this case though we can knock it out fast. Sometimes when we list everything in one place people look at it, get a headache, and start drinking :). To me it seems like a vector that is empty (no elements) is different from having a vector with 1 or more 0d entries. In the latter case, according to the formula, the norm is zero, but in the first case, is it? On Fri, 8 Jan 2016 18:41:27 -0600, Ole Ersoy wrote: public double getLInfNorm() { double norm = 0; Iterator it = iterator(); while (it.hasNext()) { final Entry e = it.next(); norm = FastMath.max(norm, FastMath.abs(e.getValue())); } return norm; } The main problem with the above is that it assumes that the elements of a "RealVector" are Cartesian coordinates. There is no provision that it must be the case, and assuming it is would contradict other methods like "append". While experimenting with the design of the current implementation I ended up throwing the exception. I think it's the right thing to do. The net effect is that if someone creates a new ArrayVector(new double[]{}), then the exception is thrown, so if they don't want it thrown then they should new ArrayVector(new double[]{0}). More explanations of this design below ... At first (and second and third) sight, I think that these container classes should be abandoned and replaced by specific ones. 
For example:
* Single "matrix" abstract type or interface for computations in the "linear" package (rather than "vector" and "matrix" types)
* Perhaps a "DoubleArray" (for such things as "append", etc.). And by the way, there already exists "ResizableDoubleArray" which could be a start.
* Geometrical vectors (that can perhaps support various coordinate systems)
* ...

I think we are thinking along the same lines here. So far I have the following: A Vector interface with only these methods:
- getDimension()
- getEntry()
- setEntry()

An ArrayVector implements Vector implementation where the one and only constructor takes a double[] array argument. The vector length cannot be mutated. If someone wants to do that they have to create a new one. A VectorFunctionFactory class containing methods that return Function and BiFunction instances that can be used to perform vector mapping and reduction. For example:

/**
 * Returns a {@link Function} that produces the lInfNorm of the vector
 * {@code v}.
 *
 * Example: {@code lInfNorm().apply(v);}
 *
 * @throws MathException
 *             Of type {@code NO_DATA} if {@code v.getDimension() == 0}.
 */
public static Function<Vector, Double> lInfNorm() {
    return lInfNorm(false);
}

/**
 * Returns a {@link Function} that produces the lInfNorm of the vector
 * {@code v}.
 *
 * Example: {@code lInfNorm(true).apply(v);}
 *
 * @param parallel
 *            Whether to perform the operation in parallel.
 * @throws MathException
 *             Of type {@code NO_DATA} if {@code v.getDimension() == 0}.
 */
public static Function<Vector, Double> lInfNorm(boolean parallel) {
    return (v) -> {
        LinearExceptionFactory.checkNoVectorData(v.getDimension());
        IntStream stream = range(0, v.getDimension());
        stream = parallel ? stream.parallel() : stream;
        return stream.mapToDouble(i -> Math.abs(v.getEntry(i))).max().getAsDouble();
    };
}

So the design leaves more specialized structures like sparse matrices to a different module. 
I'm not sure if this is the best design, but so far I'm feeling pretty good about it. WDYT? Cheers, Ole
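A runnable sketch of how the proposed pieces might fit together, with the exception check inlined since LinearExceptionFactory and MathException are part of the proposal rather than existing API (all names illustrative):

```java
import java.util.function.Function;
import java.util.stream.IntStream;

class VectorFactoryDemo {
    // Minimal version of the proposed design: a bare Vector interface plus
    // a factory method returning a reusable norm Function. The names follow
    // the proposal above but are illustrative, not Commons Math API.
    interface Vector {
        int getDimension();
        double getEntry(int i);
    }

    static Function<Vector, Double> lInfNorm(boolean parallel) {
        return v -> {
            // Stand-in for LinearExceptionFactory.checkNoVectorData(...).
            if (v.getDimension() == 0) {
                throw new IllegalArgumentException("NO_DATA");
            }
            IntStream stream = IntStream.range(0, v.getDimension());
            if (parallel) {
                stream = stream.parallel();
            }
            return stream.mapToDouble(i -> Math.abs(v.getEntry(i))).max().getAsDouble();
        };
    }

    // Fixed-length, array-backed implementation for demonstration.
    static Vector of(double... data) {
        return new Vector() {
            public int getDimension() { return data.length; }
            public double getEntry(int i) { return data[i]; }
        };
    }
}
```

Usage would then look like the javadoc examples above: `double n = lInfNorm(true).apply(v);`.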
[math] Java 8 RealVector Functional Design
Hi, I'm attempting a more minimalistic array vector design and just thought I'd float a partial API to see what you think. The methods below are both 'mapToSelf' by default. If the user wants a new vector, she should first clone the vector and then call the map method (vector.clone().map(...)).

public void map(BiFunction<Double, Double, Double> function, Vector v) {
    Arrays.setAll(data, i -> function.apply(data[i], v.getEntry(i)));
}

public void parallelMap(BiFunction<Double, Double, Double> function, Vector v) {
    Arrays.parallelSetAll(data, i -> function.apply(data[i], v.getEntry(i)));
}

The above two functions (dimension check left out) allow you to "plug in" a lambda function to perform the mapping. For example if you want to perform addition, you would use the addition BiFunction like this:

public static BiFunction<Double, Double, Double> add = (x, y) -> {
    return x + y;
};

RUNTIME: vector2.map(add, vector1);

Then the same for subtraction, multiplication, etc. I'm thinking the static BiFunction instances can go in the Arithmetic module. That way the map methods can use both checked and unchecked arithmetic operations. I'm hoping that this will also make the FieldVector and RealVector implementations more efficient from a code sharing viewpoint and symmetric from an API perspective. Thoughts? Cheers, Ole
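A self-contained sketch of the API above, using a plain array-backed class so it can be run as-is (names are illustrative, not CM API):

```java
import java.util.Arrays;
import java.util.function.BiFunction;

class MapToSelfDemo {
    // Sketch of the proposed map-to-self design: mapping mutates this
    // vector's backing array, combining it entry-wise with another vector
    // through a pluggable BiFunction.
    static final BiFunction<Double, Double, Double> ADD = (x, y) -> x + y;

    final double[] data;

    MapToSelfDemo(double[] data) {
        this.data = data;
    }

    double getEntry(int i) {
        return data[i];
    }

    // Sequential map-to-self: data[i] = f(data[i], v[i]).
    void map(BiFunction<Double, Double, Double> f, MapToSelfDemo v) {
        Arrays.setAll(data, i -> f.apply(data[i], v.getEntry(i)));
    }

    // Parallel variant backed by Arrays.parallelSetAll.
    void parallelMap(BiFunction<Double, Double, Double> f, MapToSelfDemo v) {
        Arrays.parallelSetAll(data, i -> f.apply(data[i], v.getEntry(i)));
    }
}
```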
Re: [math] Java 8 RealVector Functional Design
Hi, This is part 2 to the below. So initially I was thinking about having a Vector with map, reduce, etc. operations on it. The map function would handle the mapping of two vectors to a third (actually just mapping one vector onto another). The reduce function would reduce either the vector or two vectors to a single value. The dotProduct method would be an example. However once Lambda functions are implemented they could just be used standalone. For example:

public static Function<Vector, Double> Norm = (v) -> {
    return Math.sqrt(
        IntStream.range(0, v.getDimension()).mapToDouble(i -> Math.pow(v.getEntry(i), 2)).sum());
};

RUNTIME:
double norm = Norm.apply(v);
// Faster
double norm = ParallelNorm.apply(v);

So it's possible to have a reduce interface on the vector like this:

double = v.reduce(Norm)

That does the same thing, but it's pointless and harder to explain. Also (I have not tested yet) but just by the looks of it, it seems the above Norm function can be applied to any implementation of Vector. Thus if the vector interface is dead simple and only supplies getEntry(i), setEntry(i), getDimension(), that's good enough for just about everything... I think... famous last words. Cheers, Ole On 01/05/2016 10:35 AM, Ole Ersoy wrote: Hi, I'm attempting a more minimalistic array vector design and just thought I'd float a partial API to see what you think. The methods below are both 'mapToSelf' by default. If the user wants a new vector, she should first clone the vector and then call the map method (vector.clone().map(...)).

public void map(BiFunction<Double, Double, Double> function, Vector v) {
    Arrays.setAll(data, i -> function.apply(data[i], v.getEntry(i)));
}

public void parallelMap(BiFunction<Double, Double, Double> function, Vector v) {
    Arrays.parallelSetAll(data, i -> function.apply(data[i], v.getEntry(i)));
}

The above two functions (dimension check left out) allow you to "plug in" a lambda function to perform the mapping. 
For example if you want to perform addition, you would use the addition BiFunction like this:

public static BiFunction<Double, Double, Double> add = (x, y) -> {
    return x + y;
};

RUNTIME: vector2.map(add, vector1);

Then the same for subtraction, multiplication, etc. I'm thinking the static BiFunction instances can go in the Arithmetic module. That way the map methods can use both checked and unchecked arithmetic operations. I'm hoping that this will also make the FieldVector and RealVector implementations more efficient from a code sharing viewpoint and symmetric from an API perspective. Thoughts? Cheers, Ole
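The standalone-lambda point above can be demonstrated with a minimal Vector interface: the Norm function works against any implementation that supplies just getDimension() and getEntry(i), so no reduce() method on the vector itself is needed (illustrative names, not CM API):

```java
import java.util.function.Function;
import java.util.stream.IntStream;

class StandaloneNorm {
    // A lambda works against *any* Vector implementation exposing only
    // getDimension() and getEntry(i); the vector type needs no reduce().
    interface Vector {
        int getDimension();
        double getEntry(int i);
    }

    // Euclidean norm as a free-standing Function, as proposed above.
    static final Function<Vector, Double> NORM = v -> Math.sqrt(
            IntStream.range(0, v.getDimension())
                     .mapToDouble(i -> v.getEntry(i) * v.getEntry(i))
                     .sum());
}
```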
[math] RealVector Fluid API vs. Java 8 Arrays.setAll
Hi, RealVector supports the following type of fluid API:

RealVector result = v.mapAddToSelf(3.4).mapToSelf(new Tan()).mapToSelf(new Power(2.3));

IIUC each time we call v.mapXXX an iteration happens, so with the above expression we loop 3 times. With Java 8 Arrays.setAll we can do the same thing like this (using static imports):

Arrays.setAll(arr, i -> pow(tan(arr[i] + 3.4), 2.3));

So that seems like a pretty good fit for ArrayRealVector. WDYT? I have not looked at SparseRealVector yet... Cheers, Ole
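For illustration, the fused single-pass version as a runnable method (assuming a plain double[] backing array rather than ArrayRealVector):

```java
import java.util.Arrays;

class FusedMapDemo {
    // The fluent chain walks the data three times; Arrays.setAll fuses
    // the add-3.4, tan, and pow-2.3 steps into a single pass in place.
    static void mapToSelf(double[] arr) {
        Arrays.setAll(arr, i -> Math.pow(Math.tan(arr[i] + 3.4), 2.3));
    }
}
```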
Re: [math] Matrix parallel operations
[...] I am curious to see how this compares to simple for-loops which I can imagine help the JIT compiler to do loop unrolling and to make use of instruction-level parallelism. Otmar, Just read another article that evaluated some of Angelika Langer's results. http://blog.codefx.org/java/stream-performance/ Here's a quote: (Don't ask me why for the arithmetic operation streaming the array's elements is faster than looping over them. I have been banging my head against that wall for a while.) I'm hoping that streaming is faster for Vectors than using the current built-in iterator for things like scalar addition, etc. Cheers, Ole
Re: [math] RealVector Fluid API vs. Java 8 Arrays.setAll
On 01/03/2016 08:41 PM, Gilles wrote: On Sun, 3 Jan 2016 19:41:46 -0600, Ole Ersoy wrote: Hi, RealVector supports the following type of fluid API:

RealVector result = v.mapAddToSelf(3.4).mapToSelf(new Tan()).mapToSelf(new Power(2.3));

IIUC each time we call v.mapXXX an iteration happens, so with the above expression we loop 3 times. With Java 8 Arrays.setAll we can do the same thing like this (using static imports):

Arrays.setAll(arr, i -> pow(tan(arr[i] + 3.4), 2.3));

So that seems like a pretty good fit for ArrayRealVector. WDYT? I have not looked at SparseRealVector yet... Cheers, Ole

[Above message is a tad mangled.] That's weird...

result = v.mapToSelf(FunctionUtils.compose(new Power(2.3), new Tan(),
    FunctionUtils.combine(new Add(), new Constant(3.4), new Identity())));

Feasible with one loop in CM: yes. Less compact than the above syntax: yes. Less efficient than the Java8 construct: I'd guess so... So I'm trying to think out whether a more minimal RealVector design makes sense and what it would look like ... that still works in all collaborative CM vector / matrix calculation use cases. For example RealVector has a method:

public abstract RealVector append(double d);

This seems like it belongs in the Collections domain and can be split off (but is that going to kill kittens?). Same thing with getSubVector and setSubVector, etc. It seems like with Arrays.setAll and Arrays.parallelSetAll the mapToSelf could be performed by more universally applicable Java 8 components. Possibly the walk methods as well. WDYT? Cheers, Ole Regards, Gilles
Re: [math] Matrix parallel operations
On 01/03/2016 02:06 AM, Otmar Ertl wrote: On 03.01.2016 at 7:49 AM, "Ole Ersoy" <ole.er...@gmail.com> wrote: Hi, I ran another test using a single parallel loop for array based matrix vector multiplication. Throughput almost tripled (test pasted at bottom):

# Run complete. Total time: 00:13:24
Benchmark                                     Mode  Cnt     Score    Error  Units
MultiplyBenchmark.parallelMultiplication     thrpt  200  2221.682 ± 48.689  ops/s
MultiplyBenchmark.singleThreadMultiplication thrpt  200   818.755 ±  9.782  ops/s

public class MultiplyBenchmark {

    public static double[] multiplySingleThreaded(double[][] matrix, double[] vector) {
        return Arrays.stream(matrix)
                .mapToDouble(row -> IntStream.range(0, row.length).mapToDouble(col -> row[col] * vector[col]).sum())
                .toArray();
    }

    public static double[] multiplyConcurrent(double[][] matrix, double[] vector) {
        return Arrays.stream(matrix).parallel()
                .mapToDouble(row -> IntStream.range(0, row.length).mapToDouble(col -> row[col] * vector[col]).sum())
                .toArray();
    }

    @State(Scope.Thread)
    public static class Matrix {
        static int size = 1;
        static double[] vector = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        public static double[][] matrix = new double[size][10];
        static {
            for (int i = 0; i < size; i++) {
                matrix[i] = vector.clone();
            }
        }
    }

    @Benchmark
    public void singleThreadMultiplication(Matrix m) {
        multiplySingleThreaded(m.matrix, m.vector);
    }

    @Benchmark
    public void parallelMultiplication(Matrix m) {
        multiplyConcurrent(m.matrix, m.vector);
    }
}

Cheers, Ole

I am curious to see how this compares to simple for-loops which I can imagine help the JIT compiler to do loop unrolling and to make use of instruction-level parallelism. According to the person that helped out initially on Stack Overflow the stream-based loop is slightly faster. 
In his experiment the for loop took 100 seconds and the stream did it in 89 seconds. http://stackoverflow.com/questions/34519952/java-8-matrix-vector-multiplication Not so sure about that though. Just read up on the below article and, like you are saying, there are some tricks for making for loops very fast: https://jaxenter.com/java-performance-tutorial-how-fast-are-the-java-8-streams-118830.html Looking at the results per the article, sticking to primitives and for loops can be wildly faster than streams. Looks like I'm going to have to follow her advice and benchmark a lot :). Cheers, Ole
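For reference, the two shapes being compared look like this. Timing them fairly needs a harness such as JMH, so this sketch only shows that the variants agree (illustrative code):

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class MatVec {
    // Plain for-loop matrix-vector multiply: primitives only, which per
    // the article above often lets the JIT unroll and vectorize best.
    static double[] multiplyLoop(double[][] m, double[] v) {
        double[] out = new double[m.length];
        for (int i = 0; i < m.length; i++) {
            double sum = 0;
            for (int j = 0; j < v.length; j++) {
                sum += m[i][j] * v[j];
            }
            out[i] = sum;
        }
        return out;
    }

    // Parallel-stream variant: one row dot-product per stream element.
    static double[] multiplyParallelStream(double[][] m, double[] v) {
        return Arrays.stream(m).parallel()
                .mapToDouble(row -> IntStream.range(0, v.length)
                        .mapToDouble(j -> row[j] * v[j]).sum())
                .toArray();
    }
}
```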
Re: [math] Matrix parallel operations
Hi Otmar, On 01/02/2016 10:33 AM, Otmar Ertl wrote: On Sat, Jan 2, 2016 at 4:38 AM, Ole Ersoy <ole.er...@gmail.com> wrote: Hi, Hope ya'll are having an awesome new year! Some matrix operations, like createRealIdentityMatrix can be turned into one liners like this: IntStream.range(0, dimension).forEach(i -> m.setEntry(i, i, 1.0)); And can be performed in parallel like this: IntStream.range(0, dimension).parallel().forEach(i -> m.setEntry(i, i, 1.0)); Applying that approach in general we could probably create a ParallelXxxRealMatrix fairly easily. Just thought I'd float the idea. Cheers, Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org Once we go for Java 8, yes. However, all these constructs have some overhead. I doubt that the naive parallelization using parallel() is faster than the sequential counterpart, especially for operations such as matrix initializations. I'm sure it depends...But just to find out whether there is any merit to it I ran the identity matrix code single threaded and parallel (With JMH), and these were the results (On my HP Envy laptop 8 cores): # Run complete. Total time: 00:14:08 Benchmark Mode Cnt Score Error Units MyBenchmark.parallelIdentity thrpt 200 18875.739 ± 400.382 ops/s MyBenchmark.singleThreadIdentity thrpt 200 11046.079 ± 130.713 ops/s So almost twice the throughput with parallel(). This was on a 20K X 20K matrix. I'll paste the test code at the bottom. A more efficient parallel implementation would require the definition of a Spliterator that divides an operation into less but larger chunks of work in order to amortize synchronization costs. In general, the implementation of efficient and well scaling parallel matrix operations requires more work than writing just a single line of code. I'm sure you are right. I'm going to test array based matrix vector multiplication next. I'm curious to see what happens once some basic operations are added. 
Cheers, Ole public class IdentityBenchmark { @State(Scope.Thread) public static class Identity { public int dimension = 2; public double[][] m = new double[dimension][dimension]; } @Benchmark public void singleThreadIdentity(Identity m) { IntStream.range(0, m.dimension).forEach(i -> m.m[i][i] = 1.0); } @Benchmark public void parallelIdentity(Identity m) { IntStream.range(0, m.dimension).parallel().forEach(i -> m.m[i][i] = 1.0); } } - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
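A stripped-down, non-JMH sketch of the identity initialization being benchmarked above (note the pasted harness sets dimension = 2, while the prose mentions a 20K x 20K matrix, so presumably the quoted run used a larger value). Each index writes a distinct diagonal slot, which is what makes the naive parallel() use data-race free here; the class and method names are illustrative only.

```java
import java.util.stream.IntStream;

public class IdentityFill {

    // Sequential one-liner, as proposed in the original email.
    static double[][] identitySequential(int n) {
        double[][] m = new double[n][n];
        IntStream.range(0, n).forEach(i -> m[i][i] = 1.0);
        return m;
    }

    // Parallel variant: each iteration touches only m[i][i], so no
    // two threads ever write the same slot.
    static double[][] identityParallel(int n) {
        double[][] m = new double[n][n];
        IntStream.range(0, n).parallel().forEach(i -> m[i][i] = 1.0);
        return m;
    }
}
```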
Re: [math] RealVector.cosine() missing checkVectorDimension?
On 01/02/2016 04:12 AM, Gilles wrote: On Sat, 2 Jan 2016 00:50:18 -0600, Ole Ersoy wrote: I think RealVector.cosine() is missing the checkVectorDimensions(v) check? "checkVectorDimensions" is called by "dotProduct". Oh good...at least one of us knows how to read code :). Ole
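A simplified sketch of the delegation Gilles describes, with hypothetical array-based signatures rather than the real RealVector API: because the dimension check lives in dotProduct, cosine inherits it for free and needs no explicit check of its own.

```java
public class VecOps {

    // dotProduct validates the lengths up front...
    static double dotProduct(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "dimension mismatch: " + a.length + " != " + b.length);
        }
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            s += a[i] * b[i];
        }
        return s;
    }

    // ...so cosine needs no explicit dimension check: dotProduct throws first.
    static double cosine(double[] a, double[] b) {
        return dotProduct(a, b)
                / (Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b)));
    }
}
```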
Re: [math] Matrix parallel operations
Hi, I ran another test using a single parallel loop for array based matrix vector multiplication. Throughput almost tripled (Test pasted at bottom): # Run complete. Total time: 00:13:24 Benchmark Mode Cnt Score Error Units MultiplyBenchmark.parallelMultiplication thrpt 200 2221.682 ± 48.689 ops/s MultiplyBenchmark.singleThreadMultiplication thrpt 200 818.755 ± 9.782 ops/s public class MultiplyBenchmark { public static double[] multiplySingleThreaded(double[][] matrix, double[] vector) { return Arrays.stream(matrix) .mapToDouble(row -> IntStream.range(0, row.length).mapToDouble(col -> row[col] * vector[col]).sum()) .toArray(); } public static double[] multiplyConcurrent(double[][] matrix, double[] vector) { return Arrays.stream(matrix).parallel() .mapToDouble(row -> IntStream.range(0, row.length).mapToDouble(col -> row[col] * vector[col]).sum()) .toArray(); } @State(Scope.Thread) public static class Matrix { static int size = 1; static double[] vector = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }; public static double[][] matrix = new double[size][10]; static { for (int i = 0; i < size; i++) { matrix[i] = vector.clone(); } } } @Benchmark public void singleThreadMultiplication(Matrix m) { multiplySingleThreaded(m.matrix, m.vector); } @Benchmark public void parallelMultiplication(Matrix m) { multiplyConcurrent(m.matrix, m.vector); } } Cheers, Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
[math] RealVector.cosine() missing checkVectorDimension?
I think RealVector.cosine() is missing the checkVectorDimensions(v) check?
[math] Matrix parallel operations
Hi, Hope y'all are having an awesome new year! Some matrix operations, like createRealIdentityMatrix, can be turned into one-liners like this: IntStream.range(0, dimension).forEach(i -> m.setEntry(i, i, 1.0)); And can be performed in parallel like this: IntStream.range(0, dimension).parallel().forEach(i -> m.setEntry(i, i, 1.0)); Applying that approach in general we could probably create a ParallelXxxRealMatrix fairly easily. Just thought I'd float the idea. Cheers, Ole
Re: [math] RealMatrixFormat.parse()
On 12/31/2015 11:10 AM, Gilles wrote: On Wed, 30 Dec 2015 21:33:56 -0600, Ole Ersoy wrote: Hi, In RealMatrixFormat.parse() MatrixUtils makes the decision on what type of RealMatrix instance to return. Ideally, this is correct as the actual type is an "implementation detail". Flexibility is gained if it just returns double[][] letting the caller decide what type of RealMatrix instance to create. That could become a problem e.g. for sparse matrices where the persistent format and the instance type could be optimized for space, but a "double[][]" cannot be. RealMatrixFormat.parse() first creates a double[][] and then it drops it into the Matrix wrapper it thinks is best, per MatrixUtils. By leaving out the last step the caller can either use MatrixUtils (Or hopefully MatrixFactory) to perform the next step. Or maybe there is no next step. Perhaps just having a double[][] is fine. It's also better for modularity, as is reduces RealMatrixFormat imports (The MatrixUtils supports Field matrices as well, and I'm attempting to separate real and field matrices into two difference modules). For modularity, IO should not be in the same module as the core algorithms. I agree in general. I'm sticking all the 'Real' (Excluding Field) classes in one module (Vector and Matrix). AbstractRealMatrix uses RealMatrixFormat, so it's tightly coupled ATM and it seems like it belongs with the real Vector and Matrix classes so... Also just curious if Array2DRowRealMatrix is worth keeping? It seems like the performance of BlockRealMatrix might be just as good or better regardless of matrix size ... although my testing is limited. I recall having performed a benchmark years ago and IIRC, the "BlockRealMatrix" started to be more only for very large matrix size (although I don't remember which). That was what I was seeing as well. 
Once matrix rows reach 100K - 10 million, performance goes up between 2X and 5X, but I did not really see any difference in performance (multiplication only) for small data sets. So I'm assuming, like Luc indicated, that the Array2DRowRealMatrix is only better when attempting to reuse the underlying double[][] matrix a lot... Cheers, Ole
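The proposed split of RealMatrixFormat.parse() can be made concrete with a hypothetical sketch, assuming a toy "rows separated by semicolons" format rather than RealMatrixFormat's actual localized syntax: parse() stops at double[][], and the caller decides which RealMatrix implementation (Array2DRowRealMatrix, BlockRealMatrix, a sparse variant) to wrap it in.

```java
public class MatrixParse {

    // Hypothetical: parse text like "1 2; 3 4" into a raw double[][],
    // leaving the RealMatrix wrapper choice entirely to the caller.
    static double[][] parse(String source) {
        String[] rows = source.split(";");
        double[][] data = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            String[] cells = rows[i].trim().split("\\s+");
            data[i] = new double[cells.length];
            for (int j = 0; j < cells.length; j++) {
                data[i][j] = Double.parseDouble(cells[j]);
            }
        }
        return data;
    }
}
```

The caller then has the "next step" option: hand the array to a factory, or keep the bare double[][] if that is all that is needed.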
Re: [math] RealMatrixFormat.parse()
On 12/31/2015 03:33 AM, Luc Maisonobe wrote: Le 31/12/2015 04:33, Ole Ersoy a écrit : [...] Of course, using this feature is rather expert use. Typically, it is done when some algorithm creates the data array by itself, and then wants to return it as a matrix, but will not use the array by itself anymore. In this case, transferring ownership of the array to the matrix instance is not a bad thing, particularly if the array is big. I agree this case is really specific so it may not be sufficient to keep this class (or to keep the constructor and the special getter). OK - I'll just leave out Array2DRowRealMatrix for now. The BlockRealMatrix is a real gem. Happy New Year! Cheers, Ole
Re: [math] RealMatrixFormat.parse()
On 12/31/2015 05:42 PM, Gilles wrote: On Thu, 31 Dec 2015 12:54:00 -0600, Ole Ersoy wrote: On 12/31/2015 11:10 AM, Gilles wrote: On Wed, 30 Dec 2015 21:33:56 -0600, Ole Ersoy wrote: Hi, In RealMatrixFormat.parse() MatrixUtils makes the decision on what type of RealMatrix instance to return. Ideally, this is correct as the actual type is an "implementation detail". Flexibility is gained if it just returns double[][] letting the caller decide what type of RealMatrix instance to create. That could become a problem e.g. for sparse matrices where the persistent format and the instance type could be optimized for space, but a "double[][]" cannot be. RealMatrixFormat.parse() first creates a double[][] and then it drops it into the Matrix wrapper it thinks is best, per MatrixUtils. By leaving out the last step the caller can either use MatrixUtils (Or hopefully MatrixFactory) to perform the next step. Or maybe there is no next step. Perhaps just having a double[][] is fine. My opinion is that this code should be in a separate IO module. where the external format can be made more flexible and more correct (such as not doing unnecessary allocation). Totally with you on that. Ideally something along the lines of MatrixPersist and MatrixParse classes that support localized formatting. Right now it's all bundled up into RealMatrixFormat...probably due to time constraints. I'll look at modularizing that part later. Right I'm breaking up MatrixUtils into MatrixFactory and LinearExceptionFactory, and then once the dust settles I can look at the IO piece in more detail. It's also better for modularity, as is reduces RealMatrixFormat imports (The MatrixUtils supports Field matrices as well, and I'm attempting to separate real and field matrices into two difference modules). For modularity, IO should not be in the same module as the core algorithms. I agree in general. I'm sticking all the 'Real' (Excluding Field) classes in one module (Vector and Matrix). 
AbstractRealMatrix uses RealMatrixFormat, so it's tightly coupled ATM and it seems like it belongs with the real Vector and Matrix classes so... Given the major refactoring which you are attempting, why not drop everything that does not belong? Good point. I'll just strip out the formatting, etc. from AbstractRealMatrix and reintroduce it in the IO module. Also just curious if Array2DRowRealMatrix is worth keeping? It seems like the performance of BlockRealMatrix might be just as good or better regardless of matrix size ... although my testing is limited. I recall having performed a benchmark years ago and IIRC, the "BlockRealMatrix" started to be more only for very large matrix size (although I don't remember which). That was what I was seeing as well. Once matrix rows reach 100K - 10 million performance goes up between 2X and 5X, but I did not really see any difference for (multiplication only) in performance for small data sets. So I'm assuming, like Luc indicated, that the Array2DRowRealMatrix is only better when attempting to reuse the underlying double[][] matrix a lot... As I recall, for "small" matrices, the "Block" version was significantly slower. Depends what we call "large" and "small"... Hmm - That probably makes sense since Block has to create the block structure. I'll have a second look once I get a good profiling setup added to the module. HAPPY NEW YEAR!! Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
[math] One more reason to add lombok
Hi, This is a getter from RealMatrixFormat: /** * Get the format prefix. * * @return format prefix. */ public String getRowPrefix() { return rowPrefix; } Within RealMatrixFormat the properties include prefix and rowPrefix. Both have the same documentation. I'm assuming that the above should say @return row prefix? With Lombok the comments are adjacent to the fields, so writing the documentation is easier and faster, while also making the above scenario much less likely. Cheers, Ole
[math] RealMatrixFormat.parse()
Hi, In RealMatrixFormat.parse() MatrixUtils makes the decision on what type of RealMatrix instance to return. Flexibility is gained if it just returns double[][], letting the caller decide what type of RealMatrix instance to create. It's also better for modularity, as it reduces RealMatrixFormat imports (MatrixUtils supports Field matrices as well, and I'm attempting to separate real and field matrices into two different modules). Also just curious if Array2DRowRealMatrix is worth keeping? It seems like the performance of BlockRealMatrix might be just as good or better regardless of matrix size ... although my testing is limited. Cheers, Ole
Re: [math] RealMatrixPreservingVisitor and RealMatrixChangingVisitor the same?
Hi Luc, On 12/30/2015 03:55 AM, Luc Maisonobe wrote: Le 30/12/2015 06:18, Ole Ersoy a écrit : Hi, Hi Ole, RealMatrixPreservingVisitor and RealMatrixChangingVisitor files look identical with the exception of a single @see Default... annotation (Which I think is redundant...same as > All known implementing classes...?). Would it make sense to remove the annotation and have RealMatrixChangingVisitor extend RealMatrixPreservingVisitor? No. They are different and used for different things. The visit method returns void in one case and double in another case. When it returns double, this double is used to update the matrix that is visited, hence the "Changing" nature of the visitor. Aha - Figured I was missing something - thanks for explaining. What do you think about removing the @see annotation (IIUC javadoc generates a link to implementing classes) and having the changing visitor extend the preserving one while overriding `visit()`? Also could you help me understand what the start() and end() methods are for? Is there some test code I can look at (I did scan BlockRealMatrixTest)? Cheers, Ole
Re: [math] RealMatrixPreservingVisitor and RealMatrixChangingVisitor the same?
On 12/30/2015 03:28 PM, Luc Maisonobe wrote: Le 30/12/2015 20:18, Ole Ersoy a écrit : Hi Luc, On 12/30/2015 03:55 AM, Luc Maisonobe wrote: Le 30/12/2015 06:18, Ole Ersoy a écrit : Hi, Hi Ole, RealMatrixPreservingVisitor and RealMatrixChangingVisitor files look identical with the exception of a single @see Default... annotation (Which I think is redundant...same as > All known implementing classes...?). Would it make sense to remove the annotation and have RealMatrixChangingVisitor extend RealMatrixPreservingVisitor? No. They are different and used for different things. The visit method returns void in one case and double in another case. When it returns double, this double is used to update the matrix that is visited, hence the "Changing" nature of the visitor. Aha - Figured I was missing something - thanks for explaining. What do you think about removing the @see annotation (IIUC javadoc generates a link to implementing classes) and having the changing visitor extend the preserving one while overriding `visit()`? This would defeat the purpose of the overloaded signatures for the various walk methods in RealMatrix. There would also be an ambiguity when calling visit and ignoring the returned value: would it be a call to the void method in the super interface or a call to the new method in the lower interface? I don't even think it is possible to override something based only on the return type. Ah - You're right - thanks - I guess I could have just tried it :). Cheers, Ole
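Luc's point can be seen with simplified stand-ins for the two visitor interfaces (the real CM interfaces also declare start() and end(), which are omitted here): since Java cannot overload or override a method by return type alone, the void-returning and double-returning visit methods must live in separate interfaces, and keeping them separate is exactly what makes the overloaded walk methods unambiguous.

```java
public class Visitors {

    // Read-only visitor: visit returns void.
    interface PreservingVisitor {
        void visit(int row, int col, double value);
    }

    // Mutating visitor: visit returns the replacement value.
    interface ChangingVisitor {
        double visit(int row, int col, double value);
    }

    // The two walk overloads coexist only because the interfaces are distinct;
    // a single interface could not declare both visit signatures.
    static void walk(double[][] m, PreservingVisitor v) {
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                v.visit(i, j, m[i][j]);
            }
        }
    }

    static void walk(double[][] m, ChangingVisitor v) {
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                m[i][j] = v.visit(i, j, m[i][j]);
            }
        }
    }
}
```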
[math] RealMatrixPreservingVisitor and RealMatrixChangingVisitor the same?
Hi, RealMatrixPreservingVisitor and RealMatrixChangingVisitor files look identical with the exception of a single @see Default... annotation (Which I think is redundant...same as > All known implementing classes...?). Would it make sense to remove the annotation and have one RealMatrixChangingVisitor extend RealMatrixPreservingVisitor? Cheers, Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
[math] Callback interface for Optimizer
Hi, Just thought I'd share a concrete callback interface that I think this may work well for optimizers in general, even though I'm just working on Levenberg Marquardt ATM. The interface is pasted below, but roughly here's how it works. //Create an instance of an observer implementation. OptimizationObserver observer = new ObserverImplementation(); //Drop it into the optimizer that implements Optimizer Optimizer optimizer = new WhateverOptimizer(observer, problem); //Fire it up (Note void return type) optimizer.start(1 = number of allotted iterations); Wait for the success, end, or error to be communicated through the Observer.notify() method per the optimizer.status field which is an instance of the OptimizationProtocol interface and that is implemented by an enum specific to the optimizer. If the optimizer converges, then the Optimizer.status field is set to success (Protocol.SUCCESS), and the optimizer is passed back via the OptimizationObserver.notify(Optimizer optimizer) method. The optimum result can be retrieved per the interface double[] Optimizer.getResult(). If the optimizer does not converge then optimizer status is set to Protocol.END before the optimizer is passed back to the observer. If there's an error, then the status is set to indicate the error code (Protocol.ERROR__TOO_SMALL_COST_RELATIVE_TOLERANCE). Here's the observer interface. public interface OptimizationObserver { /** * Called when optimizer converges (Protocol.SUCCESS), ends (Protocol.END), * or encounters an error. * * @param optimizer */ void notify(Optimizer optimizer); } So within the implemented notify method we would do something like: { if (optimizer.getStatus == Protocol.SUCCESS) { //Champagne!! Number[] result = optimizer.getResult(); } else if (optimizer.getStatus == Protocol.END) { //Do another round of iterations? optimizer.start(1); //or //Just use the result - It's 90% optimal. 
Number[] result = optimizer.getResult(); } else if (optimizer.getStatus == Protocol.BLACKHOLE_ERROR) { //This is bad - Throw in the towel } Thoughts? Cheers, Ole
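A toy, self-contained sketch of the proposed observer protocol. Everything concrete here (the Protocol enum values, the convergence rule, the 42.0 stand-in optimum) is invented for illustration; only the start()/notify()/getStatus()/getResult() shape comes from the proposal above.

```java
public class OptimizerSketch {

    enum Protocol { SUCCESS, END, ERROR }

    // The callback interface from the proposal: the optimizer hands itself back.
    interface OptimizationObserver {
        void notify(Optimizer optimizer);
    }

    // Toy optimizer that "converges" once enough iterations have been allotted.
    static class Optimizer {
        private final OptimizationObserver observer;
        private final int iterationsNeeded;
        private int iterationsDone;
        private Protocol status = Protocol.END;
        private double[] result = new double[0];

        Optimizer(OptimizationObserver observer, int iterationsNeeded) {
            this.observer = observer;
            this.iterationsNeeded = iterationsNeeded;
        }

        Protocol getStatus() { return status; }
        double[] getResult() { return result; }

        // Note the void return type: the outcome travels through the observer.
        void start(int allottedIterations) {
            iterationsDone += allottedIterations;
            if (iterationsDone >= iterationsNeeded) {
                status = Protocol.SUCCESS;
                result = new double[] { 42.0 }; // stand-in optimum
            } else {
                status = Protocol.END; // budget exhausted, not yet converged
            }
            observer.notify(this);
        }
    }
}
```

The observer can then react to END by calling start() again, mirroring the "do another round of iterations?" branch in the message above.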
Re: [math] RealVector.isInfinite() mixes in Nan check?
On 12/26/2015 12:12 PM, Gilles wrote: On Sat, 26 Dec 2015 10:21:30 -0600, Ole Ersoy wrote: In RealVector there is an isInfinite() method that checks for isInfinite() and isNaN() at the same time. If any coordinate is infinite, it will return true...unless a value is NaN...then it will return false. I'm probably missing something...but it seems like isInfinite() should return true if the 'isInfinite' condition matches, without the check for NaN mixed in? I'd think that if any component is NaN then "isInfinite" should indeed be false. But in that case, it does not mean that all components are finite... Perhaps it would be less surprising to have a method "isFinite" (no infinities and no NaNs). I think it would be good to add that and only check for elements that are infinite in isInfinite(). Cheers, Ole Regards, Gilles There exists a method that checks for NaN as well. I pasted both below: /** * Check whether any coordinate of this vector is {@code NaN}. * * @return {@code true} if any coordinate of this vector is {@code NaN}, * {@code false} otherwise. */ public abstract boolean isNaN(); /** * Check whether any coordinate of this vector is infinite and none are * {@code NaN}. * * @return {@code true} if any coordinate of this vector is infinite and * none are {@code NaN}, {@code false} otherwise. */ public abstract boolean isInfinite(); Thoughts? Cheers, Ole
[math] RealVector.isInfinite() mixes in Nan check?
In RealVector there is an isInfinite() method that checks for isInfinite() and isNaN() at the same time. If any coordinate is infinite, it will return true...unless a value is NaN...then it will return false. I'm probably missing something...but it seems like isInfinite() should return true if the 'isInfinite' condition matches, without the check for NaN mixed in? There exists a method that checks for NaN as well. I pasted both below: /** * Check whether any coordinate of this vector is {@code NaN}. * * @return {@code true} if any coordinate of this vector is {@code NaN}, * {@code false} otherwise. */ public abstract boolean isNaN(); /** * Check whether any coordinate of this vector is infinite and none are * {@code NaN}. * * @return {@code true} if any coordinate of this vector is infinite and * none are {@code NaN}, {@code false} otherwise. */ public abstract boolean isInfinite(); Thoughts? Cheers, Ole
Re: [math] RealVector.isInfinite() mixes in Nan check?
On 12/26/2015 02:41 PM, Phil Steitz wrote: On 12/26/15 11:12 AM, Gilles wrote: On Sat, 26 Dec 2015 10:21:30 -0600, Ole Ersoy wrote: In RealVector there is an isInfinite() method that checks for isInfinite() and isNaN() at the same time. If any coordinate is infinite, it will return true...unless a value is NaN...then it will return false. I'm probably missing something...but it seems like isInfinite() should return true if the 'isInfinite' condition matches, without the check for NaN mixed in? I'd think that if any component is NaN then "isInfinite" should indeed be false. Right. That is what we advertise and do now. Looking at the implementation, it is a bit lazy, though, as it calls isNaN instead of just interleaving the NaN check in one pass through the array. We should fix that. But in that case, it does not mean that all components are finite... Perhaps it would be less surprising to have a method "isFinite" (no infinities and no NaNs). Do you have a use case for such a method? In fact, neither isNaN nor isInfinite is used anywhere in [math]. Could be deprecation candidates. Thumbs up for deprecating. Cheers, Ole
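Phil's one-pass suggestion can be sketched on a raw double[] (these are hypothetical helpers, not the RealVector methods themselves): the NaN check is interleaved into the same loop instead of a separate isNaN() sweep, and an isFinite along the lines Gilles suggested falls out naturally.

```java
public class VectorPredicates {

    // One pass: NaN dominates, as the Javadoc advertises, but no second sweep.
    static boolean isInfinite(double[] v) {
        boolean sawInfinite = false;
        for (double x : v) {
            if (Double.isNaN(x)) {
                return false; // any NaN forces false, regardless of infinities
            }
            if (Double.isInfinite(x)) {
                sawInfinite = true;
            }
        }
        return sawInfinite;
    }

    // "No infinities and no NaNs" in a single loop.
    static boolean isFinite(double[] v) {
        for (double x : v) {
            if (Double.isNaN(x) || Double.isInfinite(x)) {
                return false;
            }
        }
        return true;
    }
}
```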
[math] Thread safe RealVector
Hi, What do you think of removing iterator(), Entry, and Iterator() from RealVector? ArrayRealVector can replace these with Vector...I think...Still need to attempt it. Vector is synchronized so it makes it easier to make ArrayRealVector thread safe. Cheers, Ole
Re: [math] Thread safe RealVector
On 12/26/2015 05:22 PM, Gilles wrote: On Sat, 26 Dec 2015 16:21:04 -0600, Ole Ersoy wrote: Hi, What do you think of removing iterator(), Entry, and Iterator() from RealVector? ArrayRealVector can replace these with Vector...I think...Still need to attempt it. Vector is synchronized so it makes it easier to make ArrayRealVector thread safe. I don't follow at all. We're in the same boat :). I've only half baked my thinking at this point, but java.util.Vector is synchronized, uses a 0 based index, and throws ConcurrentModificationException if it is modified while it is being iterated over, so it would provide thread safety for operations like add(RealVector v). It also provides a lot of the modification methods provided by RealVector like append, getSubVector, etcAlthough they are named differently. But I think utilities that performs Matrix / Vector operations might be better(Thinking out loud). For example: Vector.add(final double[] v1 , final double[] v2) { //validate //operate //return brand new array. } That's thread safe without even trying :). But anyways, please have a look at the long-standing issue on JIRA concerning the refactoring of RealVector/RealMatrix. Will do. https://issues.apache.org/jira/browse/MATH-765 Just did. I think it's better to just skip the whole thing and separate the pieces into: 1) Structure: One dimensional or two dimensional arrays 2) Structure Manipulation: Manipulate one dimensional or two dimensional arrays 3) Operations (Like above Vector.add(), Matrix.add()) Cheers, Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
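The "thinking out loud" utility style above can be made concrete. This sketch is a stateless static operation that validates, operates, and returns a brand-new array, so it is thread safe without any synchronization; the class and method names are illustrative, not an existing CM API.

```java
public class VectorOps {

    // Validate, operate, return a brand-new array: no shared mutable state,
    // so concurrent callers never need a lock.
    static double[] add(double[] v1, double[] v2) {
        if (v1.length != v2.length) {
            throw new IllegalArgumentException(
                    "dimension mismatch: " + v1.length + " != " + v2.length);
        }
        double[] out = new double[v1.length];
        for (int i = 0; i < v1.length; i++) {
            out[i] = v1[i] + v2[i];
        }
        return out;
    }
}
```

This is the "thread safe without even trying" property from the message: safety comes from immutability of inputs and freshness of outputs, not from java.util.Vector's per-method synchronization.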
Re: [math] MatrixDimensionMismatchException vs. DimensionMismatchException
Actually - if the Factory is used, then the key signature should determine if it's an MDME or DME. So if 4 keys are used (row, column, row, column) then MDME, otherwise DME...Sound good? Ole On 12/24/2015 06:12 PM, Ole Ersoy wrote: So if IUC whenever we are dealing with matrices, an MDME should be thrown? So this needs an update?: https://github.com/apache/commons-math/blob/master/src/main/java/org/apache/commons/math4/linear/RealMatrix.java#L95 Ole
Re: [math] Exception Design
Hi Gilles, It sounds like we're starting to coalesce on this. Hopefully this is not going to come as too much of a shock :). [1] The exception patch will permeate every class that throws an exception. We will have to delete all the other exception and replace them with MathException. Here's an example from the ArithmeticUtils refactoring: https://github.com/firefly-math/firefly-math-arithmetic/blob/master/src/main/java/com/fireflysemantics/math/arithmetic/Arithmetic.java#L582 Notice that the Javadoc has 2 throws declarations containing the same exception using different exception types. Luc - I think the localization design your created is as elegant as can be. As far as productivity goes the ExceptionFactory class utilization pattern remains unchanged. I feel like we just got done stacking 50 wine classes, and I'm not sure I want to move anything, unless I just throw a fork wrench at it in [1] above :). So suppose we include the MathException with localization. If anyone is a super purist (Like me) they can just remove the 6 lines of localization code. Reinstall. Done. So from a practical point of view it's about as trivial as it gets. So the only thing that bugs me about it is that others looking at CM might have the "Why is there localization code in the exception" reaction. And maybe there is a good reason for that, but having looked at the threads so far, it's not obvious to me that there is. For example say we have an application exists that throws at 10,000 different points in the App without catching. So I'm guessing either a server web app or just some app we run from the console. We get a stack trace. Most of it is in English right (Java vernacular)? So my gut reaction is that if someone gives me a technical manual in Greek, and it has one sentence in English, that's not so helpful. This is Java and Math combined. Anyone looking at it is probably very smart, very confused, or asleep. 
Also, the documentation will be updated to say that in order to get translated exception messages, before using CM, set this property on this class. Or it could say: --- In order to translate exception put the following try catch block at the root entry point to your application: try { Application.run() } catch(MathException e) { u.translate(e) } This has the same effect, but I think it's a better pattern for developers in general. The functionality for `u.translate(e)` can be provided in a separate module. You can do more than just translate the exception. Move it to a central logging server, etc. I also think makes CM look "Sharper" by advertising the above usage pattern, although everyone knows it's fairly amazing as is. Cheers, Ole On 12/24/2015 08:06 AM, Gilles wrote: Hi Ole. At this point in time, there are still incompatible views on the purpose of an "exception framework" in the context of a library like CM. Perhaps you could restate your proposal (on another thread?), and ask explicitly if people agree with the principle. If in the affirmative, you should then open an issue on JIRA and prepare a patch for the master branch, in order to ensure that there is no hidden problem. Thanks, Gilles - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org . - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
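A minimal sketch of the "translate at the root entry point" pattern described above, with a hypothetical MathException carrying a stable code and a toy Norwegian lookup table standing in for a real ResourceBundle; none of these names come from CM itself.

```java
import java.util.Map;

public class RootTranslate {

    // Hypothetical single exception type carrying a stable code.
    static class MathException extends RuntimeException {
        final String code;
        MathException(String code) {
            super(code);
            this.code = code;
        }
    }

    // Hypothetical translator; in practice this would be a ResourceBundle
    // keyed by the exception code for the user's Locale.
    static final Map<String, String> NO = Map.of("INTEGER_OVERFLOW", "heltallsoverflyt");

    static String translate(MathException e) {
        return NO.getOrDefault(e.code, e.code);
    }

    // The single try/catch at the application's root entry point.
    static String run(Runnable app) {
        try {
            app.run();
            return "ok";
        } catch (MathException e) {
            return translate(e); // translate (or log, forward, rethrow) centrally
        }
    }
}
```

The same catch block could just as well ship the exception to a central logging server, which is the extra flexibility the message above argues for.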
[math] MatrixDimensionMismatchException vs. DimensionMismatchException
So if IUC whenever we are dealing with matrices, an MDME should be thrown? So this needs an update?: https://github.com/apache/commons-math/blob/master/src/main/java/org/apache/commons/math4/linear/RealMatrix.java#L95 Ole
Re: [math] Exception Design
[...] On 12/23/2015 03:38 AM, luc wrote: interface ExceptionLocalizer { /** Localize an exception message. * @param locale locale to use * @param me exception to localize * @return localized message for the exception */ String localize(Locale locale, MathException me); } and having ExceptionFactory hold a user-provided implementation of this interface? public class ExceptionFactory { private static ExceptionLocalizer localizer = new NoOpLocalizer(); public static setLocalizer(ExceptionLocalizer l) { localizer = l; } public static String localize(Locale locale, MathException me) { return localizer.localize(locale, me); } /** Default implementation of the localizer that does nothing. */ private static class NoOpLocalizer implements ExceptionLocalizer { /** {@inheritDoc} */ @Override public String localize(MathException me) { return me.getMessage(); } } } and MathException could implement both getLocalizedMessage() and even getMessage(Locale) by delegating to the user code: public class MathException { public String getLocalizedMessage() { return ExceptionFactory.localize(Locale.getDefault(), this); } public String getMessage(Locale locale) { return ExceptionFactory.localize(locale, this); } ... } Nice! One thing that would be nice would be that in addition to the get method, MathException also provides a getKeys to retrieve all keys and a getType to retrieve the type. Just added getKeys() (Line 48): https://github.com/firefly-math/firefly-math-exceptions/blob/master/src/main/java/com/fireflysemantics/math/exception/MathException.java getType() already there (Line 29 - @Getter annotation produces the getter) Also added getMethodName() and getClassName() to get the source of the exception. Since there is only a single exception there is no need to unroll it to get the root cause. [...] Cheers, Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [math] Exception Design
[...] Looks good. Where is the code? ;-) So CM clients would: catch (MathException e) { String exceptionTemplate = ResourceBundle.getBundle("cm.exception.templates", new Locale("en", "US")).getString(e.getType()); String i18Nmessage = buildMessage(exceptionTemplate, e.getContext()); ... } I can prototype that out more. Just trying to get a feel for how viable the concept is first. I'm not sure I understand correctly. Does that mean that 1. Uncaught exceptions will lead to a message in English? Sort of - right now the exception message is just the Enum 'type/code'. For example: MAE__INTEGER_OVERFLOW. Calling toString() on the exception will produce a more detailed message with the context parameters as well. 2. Every "catch" must repeat the same calls (although the arguments are likely to be the same for all clauses (and for all applications))? Well - suppose the use case is using CM to just write quick simulations in the main block. We don't want to bother the analyst with what exceptions mean, try/catch concepts, etc. So primarily the user is concerned with the //CM code block below: public static void main(String args[]) { //The CM code... } So instead of providing the above block as a starting point this block would be provided: public static void main(String args[]) { try { //The CM code... } catch (MathException e) { u.rethrowLocalized(e); } } Comparing this with the current behaviour (where the translated message String is created when "e.getLocalizedMessage()" is called) is likely to make people unhappy. Hopefully they would be OK with something like the above. If a single MathException is adopted, it should be very easy to create a cheat sheet with all the exception codes that can occur in a given scenario, along with what each means in a given language, in addition to handling it with the type of utility described above. 
I would think the goal would be to get the user to understand the exceptions as they are thrown in Java vernacular, though... Even if I see an exception message in Norwegian I still need to read the code and Javadoc to figure out what it means. I'm still looking into the possibility of a custom designed annotation to do the above utility, but it may require the use of AspectJ or Apache Commons Weaver. [...] I'd be interested in settling the matter of whether we must use a single hierarchy because of technical limitations, or if it is a good solution on its own, i.e. extending standard exceptions is not a better practice after all. I think we understand this, but anything other than a single exception is going to introduce a non-trivial amount of additional effort for users, maintainers, and documenters alike. Consider the ripple effect this has on other libraries using CM. We could provide a utility: public boolean isMathException(RuntimeException e) { if (e instanceof MathException) { return true; } final Throwable t = e.getCause(); if (t != null) { if (t instanceof MathException) { return true; } } return false; } Or just not wrap. Of course, but choosing one or the other is not a technical problem; it's a design decision. Do we have arguments (or references to them)? As Luc pointed out, we want to be able to catch the exception and know that it came from CM with minimal effort. If another layer is using CM, then that layer should catch the CM exception and rethrow it wrapped in a standard exception, providing a global facade the same way Spring does. public class ExceptionFactory { public static void throwIllegalArgumentException(MathException e) { throw new IllegalArgumentException(e); } public static void throwNullPointerException(MathException e) { NullPointerException npe = new NullPointerException(); npe.initCause(e); throw npe; } // And so on... } So, CM code would be public class Algo { public void evaluate(double value) { // Check precondition. 
final double min = computeMin(); if (value < min) { final MathException e = new MathException(NUMBER_TOO_SMALL).put(CONSTRAINT, min).put(VALUE, value); ExceptionFactory.throwIllegalArgumentException(e); } // Precondition OK. } } Another thing that I hinted at is that the factory builds the precondition check into the throw method, so that the line: if (value < min) { can be nixed. It seems nice to ensure that the exception raised is consistent with the checked condition. That's the idea. OK, but do you foresee that all precondition checks will be handled by factory methods? It would be nice. Like you said, it's also good if an exception is always produced by a globally unique condition. It would not be so nice to have explicit checks sprinkled here and there. Indeed. The single exception design should allow for a factory method for all the Enum types. If it's done right, the factory should also make writing the utility for localizing messages easier. [...] Cheers, Ole
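The "factory closes over the check" idea above can be sketched as a small runnable example. The NUMBER_TOO_SMALL type and the put(CONSTRAINT, ...).put(VALUE, ...) context chaining follow the thread; the exact class shapes and signatures are my own assumptions, not the actual firefly-math code.

```java
import java.util.EnumMap;
import java.util.Map;

public class FactorySketch {

    enum ExceptionType { NUMBER_TOO_SMALL }
    enum Key { CONSTRAINT, VALUE }

    /** Single unchecked exception carrying a type code and a parameter context. */
    static class MathException extends RuntimeException {
        final ExceptionType type;
        final Map<Key, Object> context = new EnumMap<>(Key.class);

        MathException(ExceptionType type) { this.type = type; }

        /** Fluent context builder, as in the thread's put(KEY, value) chaining. */
        MathException put(Key k, Object v) { context.put(k, v); return this; }
    }

    /**
     * The precondition check and the throw live in one place, so call sites
     * never repeat the "if (value < min)" test and the exception type can
     * never drift out of sync with the condition that raises it.
     */
    static void checkNumberTooSmall(double value, double min) {
        if (value < min) {
            throw new MathException(ExceptionType.NUMBER_TOO_SMALL)
                    .put(Key.CONSTRAINT, min)
                    .put(Key.VALUE, value);
        }
    }

    public static void main(String[] args) {
        checkNumberTooSmall(3.0, 1.0); // passes silently
        try {
            checkNumberTooSmall(0.5, 1.0);
        } catch (MathException e) {
            // The handler reads the type code and parameters directly.
            System.out.println(e.type + " " + e.context);
        }
    }
}
```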
Re: [Math] Exceptions from "JDKRandomGenerator"
One drawback of having IAE thrown by CM is that it complicates and blurs things for those designing a handler that catches all CM exceptions. CM advertising a factory that throws each exception 'type' under globally unique conditions minimizes root cause analysis time and indirection. This: if (n <= 0) { throw new IllegalArgumentException("n <= 0"); } misses out on the factory benefit of closing over the condition that checks and throws the exception. It also makes the explanation for developing and using CM longer... Is it a NOT_STRICTLY_POSITIVE_EXCEPTION, or an IAE that actually is a NOT_STRICTLY_POSITIVE_EXCEPTION? If I know that it's a NOT_STRICTLY_POSITIVE_EXCEPTION then I'm one step ahead. Maybe I can simply set the argument to zero and try again, or just throw that step away and continue. If we standardize on using Factory.checkNotStrictlyPositiveException(key, n) the client developer can also grab the key and the n value and reconstruct the message. Also, this: Factory.checkNotStrictlyPositiveException(key, n) is easier, more semantic, less error prone, and faster to write than: if (n <= 0) { throw new IllegalArgumentException("n <= 0"); } And it provides more benefits: - Parameter name(s) - Parameter values - More semantic - Almost instant path to root cause - Exception-throwing class (one method call - no unrolling of the cause stack) - Exception-throwing method (one method call - no unrolling of the cause stack) Also, if CM modularizes, then the factory approach standardizes exception generation and handling across the entire ecosystem. Cheers, Ole On 12/23/2015 10:39 AM, Gilles wrote: On Wed, 23 Dec 2015 16:26:52 +0100, Thomas Neidhart wrote: On 12/21/2015 04:41 AM, Gilles wrote: On Sat, 19 Dec 2015 11:35:26 -0700, Phil Steitz wrote: On 12/19/15 9:02 AM, Gilles wrote: Hi. 
While experimenting on https://issues.apache.org/jira/browse/MATH-1300 I created a new JDKRandomGeneratorTest that inherits from RandomGeneratorAbstractTest, similarly to the classes for testing all the other RNGs implemented in CM. The following tests (implemented in the base class) failed: 1. testNextIntNeg(org.apache.commons.math4.random.JDKRandomGeneratorTest) Time elapsed: 0.001 sec <<< ERROR! java.lang.Exception: Unexpected exception, expected but was 2. testNextIntIAE2(org.apache.commons.math4.random.JDKRandomGeneratorTest) Time elapsed: 0.015 sec <<< ERROR! java.lang.IllegalArgumentException: bound must be positive This is caused by try/catch clauses that expect a "MathIllegalArgumentException", but "JDKRandomGenerator" extends "java.util.Random", which for those methods throws "IllegalArgumentException". What to do? I would change the test to expect IllegalArgumentException. Most [math] generators actually throw NotStrictlyPositiveException here, which extends MIAE, which extends IAE, so this should work. It turns out that, in the master branch, the hierarchy is RuntimeException | MathRuntimeException | MathIllegalArgumentException as per https://issues.apache.org/jira/browse/MATH-853 [And the Javadoc and "throws" clauses are not yet consistent with this in all the code base (e.g. the "RandomGenerator" interface).] So, in 4.0, "JDKRandomGenerator" should probably not inherit from "java.util.Random" but delegate to it, trap standard exceptions raised, and rethrow CM ones. which is probably a good indication that the current situation in CM4 (as per MATH-853) was not a good design decision. It was consensual. What you express below is far more controversial. I applied the changes as I thought the issue was settled, but it turns out that some of its implications were not fully taken into account. From my POV, we should stick to the existing exceptions where applicable, as this is what people usually expect and is good practice. 
This means we should not create our own MathIAE but instead throw a standard IAE. I understand that the MIAE was created to support the localization of exception messages, but I wonder if this is really needed in that case. Usually, when an IAE is thrown it indicates a bug, as the developer did not provide proper arguments as indicated per javadoc. Now I do not see the value of being able to localize such exceptions as *only* developers should ever see them. This is a point I made a long time ago (not "in a far, far away galaxy"). To which I got the answer that CM must provide a * detailed, * localizable, * textual error message. IMO, for a bug pointed to by an IAE, all the developer has to know is the stack trace. If we want to be "standard", we shouldn't even have to check for null or array length on caller's input data as we know that the JVM will do the checks and trivially throw standard exceptions on these programming errors! Javadoc and stack trace are necessary and sufficient to fix those bugs. For any other exceptions (typically converge
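The Factory.checkNotStrictlyPositiveException(key, n) idea from this thread can be made concrete with a small sketch. The method name follows the thread; the MathException shape and everything else here are assumptions for illustration only.

```java
public class PositiveCheck {

    /** Minimal stand-in for the single math exception discussed in the thread. */
    static class MathException extends RuntimeException {
        final String type;
        final String key;
        final long value;

        MathException(String type, String key, long value) {
            super(type + " [" + key + "=" + value + "]");
            this.type = type;
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Throws iff n <= 0. The condition, the exception type, and the parameter
     * name/value are bound together in one place, so a handler can recover
     * the offending parameter without parsing a message string.
     */
    static void checkNotStrictlyPositive(String key, long n) {
        if (n <= 0) {
            throw new MathException("NOT_STRICTLY_POSITIVE", key, n);
        }
    }

    public static void main(String[] args) {
        checkNotStrictlyPositive("n", 5); // OK, no exception
        try {
            checkNotStrictlyPositive("n", 0);
        } catch (MathException e) {
            // The handler can grab the key and the value directly,
            // as argued above, and reconstruct any message it wants.
            System.out.println(e.type + ": " + e.key + "=" + e.value);
        }
    }
}
```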
[math] LevenbergMarquardt Lombok Generated Configuration Object
Hola, I started working on my LevenbergMarquardt optimizer experiment, and figured I'd share the Lombok-generated immutable configuration class that I split off from the optimizer. This is only for show... not trying to restart the Lombok-inclusion-in-CM discussion. https://github.com/firefly-math/firefly-math-levenberg-marquardt/blob/master/src/main/java/com/fireflysemantics/math/optimization/LMConfiguration.java The @Value annotation makes all the fields private and final, and generates the getters and the all-args constructor. I attached a static DEFAULT configuration instance that I suspect will be used 90% of the time. All other cases are handled by the all-args constructor. Cheers, Ole
Re: [math] Exception Design
On 12/22/2015 11:46 AM, Gilles wrote: Hi. On Mon, 21 Dec 2015 22:44:16 -0600, Ole Ersoy wrote: On 12/21/2015 06:44 PM, Gilles wrote: On Mon, 21 Dec 2015 12:14:16 -0600, Ole Ersoy wrote: Hi, I was considering jumping into the JDKRandomGenerator exception discussion, but I did not want to hijack it. Not sure if any of you have had a chance to looks at this: https://github.com/firefly-math/firefly-math-exceptions/ https://github.com/firefly-math/firefly-math-exceptions/blob/master/src/main/java/com/fireflysemantics/math/exception/MathException.java I had a rapid look; unfortunately not in sufficient details to grasp the major departures from the existing framework. Could you display one or two examples comparing CM and firefly? In addition to what I summarized below one detail that I think is important is that the ExceptionTypes enum allows for more exact targeting of the exception root cause. For instance right now I have the following arithmetic exception types: /** * MATH ARITHMETIC EXCEPTIONS */ MAE("MAE"), MAE__INTEGER_OVERFLOW("MAE__INTEGER_OVERFLOW"), MAE__LONG_OVERFLOW("MAE__LONG_OVERFLOW"), MAE__OVERFLOW_IN_ADDITION("MAE__OVERFLOW_IN_ADDITION"), MAE__OVERFLOW_IN_SUBTRACTION("MAE__OVERFLOW_IN_SUBTRACTION"), MAE__GCD_OVERFLOW_32_BITS("MAE__GCD_OVERFLOW_32_BITS"), MAE__GCD_OVERFLOW_64_BITS("MAE__GCD_OVERFLOW_64_BITS"), MAE__LCM_OVERFLOW_32_BITS("MAE__LCM_OVERFLOW_32_BITS"), MAE__LCM_OVERFLOW_64_BITS("MAE__LCM_OVERFLOW_64_BITS"), MAE__DIVISION_BY_ZERO("MAE__DIVISION_BY_ZERO"), Side remark: The argument to the enum element is the same as the enum element's name; is there a way to avoid the duplication (i.e. the string would be generated automatically)? Good point! Originally I was considering using numbers for each group of exceptions. For example 100 for arithmetic exceptions, 200 for matrix exceptions, etc. But I think strings are fine. 
So now it's down to this: https://github.com/firefly-math/firefly-math-exceptions/blob/master/src/main/java/com/fireflysemantics/math/exception/ExceptionTypes.java /** * MATH ARITHMETIC EXCEPTIONS */ MAE, MAE__INTEGER_OVERFLOW, MAE__LONG_OVERFLOW, MAE__OVERFLOW_IN_ADDITION, MAE__OVERFLOW_IN_SUBTRACTION, MAE__GCD_OVERFLOW_32_BITS, MAE__GCD_OVERFLOW_64_BITS, MAE__LCM_OVERFLOW_32_BITS, MAE__LCM_OVERFLOW_64_BITS, MAE__DIVISION_BY_ZERO, ... So by looking at the exception type we know exactly what the issue is. With this approach CM will always only have 1 exception. If more types are needed then just add another line to the ExceptionTypes Enum. The new type is used to look up the message template in the I18N resource bundle. It looks neat. Thanks :) But I did not see how localization is handled. I did leave localization out. I think localization was a hard requirement in earlier versions of CM, but I'm hoping that there is some flexibility on this There was not, since I argued many times to leave it out. So unless you can show practically how it can work, I have my doubts that we'll be allowed to go forward with this approach. and that future versions can defer to a utility that uses the ExceptionTypes Enum instance as the key to look up the internationalized template string. Looks good. Where is the code? ;-) So CM clients would: catch(MathException e) { String exceptionTemplate = ResourceBundle.getBundle("cm.exception.templates", new Locale("en", "US")).getString(e.getType()); String i18Nmessage = buildMessage(exceptionTemplate, e.getContext()); ... } I can prototype that out more. Just trying to get a feel for how viable the concept is first. I think it satisfies everyone's requirements with: - A single MathException (No hierarchy) That would not satisfy everyone. :-! 
- The ExceptionTypes Enum contains all the exception types - The ExceptionTypes Enum 'key' maps to the corresponding message 1 to 1 - The ExceptionFactory (Utility) throws exceptions, if necessary, that always have a single unique root cause, such as NPEs I was wondering whether the "factory" idea could indeed satisfy everyone. Rather than throwing the non-standard "MathException", the factory would generate one of the standard exceptions, constructed with the internal "MathException" as its cause: I think it's good that CM throws CM-specific exceptions. This way when I write the handler I can know that the exception is CM-specific without having to unwrap it. But if there are several CM exception hierarchies, the handler will have to check for every base type, leading to more code. True dat - but if there are no CM exception hierarchies then they don't :). We could provide a utility: public boolean isMathException(RuntimeException e) { if (e
Re: [Math] Exceptions from "JDKRandomGenerator"
One of the points in having exceptions that extend our own root exception is that users at a higher level can catch this top level. Currently, we don't even advertise properly what we throw. We even fail to forward upward some exceptions thrown at low level in the javadoc/signature of our upper level methods. So users may currently not know, only reading the javadoc/signature of one of our implementations, that they may get a MIAE or something else. If we were using a single root, Or just a single MathException. they would at least be able to do a catch (MathRootException) that would prevent a runtime exception from bubbling up too far. Currently, defensive programming to protect against this failure is to catch all of MathArithmeticException, MathIllegalArgumentException, MathIllegalNumberException, MathIllegalStateException, MathUnsupportedOperationException, and MathRuntimeException. With the design I proposed, and the design I'm using, we only have to catch one. After it's caught, the type code (Enum) indicates precisely what the issue is. In a perfect world, we would be able to extend a regular IAE while implementing a MathRootException, but Throwable in Java is a class, not an interface. Too bad. Luc - how do you feel about a single MathException that extends RuntimeException with an Enum that indicates what the root cause is and can be used as the key to look up the corresponding message template, which can be resolved into a message using parameters attached to the MathException context? Here's an example from my refactoring of ArithmeticUtils: https://github.com/firefly-math/firefly-math-arithmetic/blob/master/src/main/java/com/fireflysemantics/math/arithmetic/Arithmetic.java /** * Add two integers, checking for overflow. * * @param x an addend * @param y an addend * @return the sum {@code x+y} * @throws MathException * Of type {@code MAE__OVERFLOW_IN_ADDITION} if the result can * not be represented as an {@code int}. 
*/ public static int addAndCheck(int x, int y) throws MathException { long s = (long) x + (long) y; if (s < Integer.MIN_VALUE || s > Integer.MAX_VALUE) { throw new MathException(MAE__OVERFLOW_IN_ADDITION).put(X, x).put(Y, y); } return (int) s; } The toString() method of the exception is implemented like this and delivers the exception root cause (Enum) and parameter name and value pairs: @Override public String toString() { String parameters = context.entrySet().stream().map(e -> e.getKey() + "=" + e.getValue()).collect(Collectors.joining(", ")); return "Firefly math exception type " + this.type + ". Context [" + parameters + "]"; } So we get a pretty good indication of what the issue is by just using toString() to construct the message. Cheers, Ole
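Putting the addAndCheck and toString() pieces quoted above together, a self-contained sketch looks like this. The enum names (MAE__OVERFLOW_IN_ADDITION, X, Y) and the toString() shape follow the thread; the wrapper class and exact declarations are paraphrased, not the actual Firefly code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class AddAndCheckSketch {

    enum Type { MAE__OVERFLOW_IN_ADDITION }
    enum Key { X, Y }

    /** Single exception with a type code and an insertion-ordered context. */
    static class MathException extends RuntimeException {
        final Type type;
        final Map<Key, Object> context = new LinkedHashMap<>();

        MathException(Type type) { this.type = type; }

        MathException put(Key k, Object v) { context.put(k, v); return this; }

        @Override
        public String toString() {
            // Joins "KEY=value" pairs, as in the quoted implementation.
            String parameters = context.entrySet().stream()
                    .map(e -> e.getKey() + "=" + e.getValue())
                    .collect(Collectors.joining(", "));
            return "Math exception type " + type + ". Context [" + parameters + "]";
        }
    }

    /** Add two ints, throwing the typed exception on overflow. */
    static int addAndCheck(int x, int y) {
        long s = (long) x + (long) y;
        if (s < Integer.MIN_VALUE || s > Integer.MAX_VALUE) {
            throw new MathException(Type.MAE__OVERFLOW_IN_ADDITION)
                    .put(Key.X, x).put(Key.Y, y);
        }
        return (int) s;
    }

    public static void main(String[] args) {
        try {
            addAndCheck(Integer.MAX_VALUE, 1);
        } catch (MathException e) {
            // toString() surfaces the type and both operands, no unwrapping needed.
            System.out.println(e);
        }
    }
}
```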
[math] Exception Design
Hi, I was considering jumping into the JDKRandomGenerator exception discussion, but I did not want to hijack it. Not sure if any of you have had a chance to look at this: https://github.com/firefly-math/firefly-math-exceptions/ https://github.com/firefly-math/firefly-math-exceptions/blob/master/src/main/java/com/fireflysemantics/math/exception/MathException.java I think it satisfies everyone's requirements with: - A single MathException (No hierarchy) - The ExceptionTypes Enum contains all the exception types - The ExceptionTypes Enum 'key' maps to the corresponding message 1 to 1 - The ExceptionFactory (Utility) throws exceptions, if necessary, that always have a single unique root cause, such as NPEs - The context captures the exception parameters keyed by the 'ExceptionKeys' enum. Each module can introduce more keys as necessary. - The toString() method can be used as the initial exception message The way developers should deal with this exception is: 1) Catch it 2) Get the type (Enum) 3) Get the parameters (Context) 4) Get the method that threw it 5) Get the class that threw it 6) Rethrow the above in an application-specific exception, log it, or display it. Construct a localized message using the enum type to look up the exception template if needed. WDYT? Cheers, - Ole P.S. Here's the entire test demo (pasted below for convenience): https://github.com/firefly-math/firefly-math-exceptions/blob/master/src/test/java/com/fireflysemantics/math/exceptions/MathExceptionTest.java /** * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. * You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
* See the License for the specific language governing permissions and * limitations under the License. */ package com.fireflysemantics.math.exceptions; import static com.fireflysemantics.math.exception.ExceptionKeys.CONSTRAINT; import static com.fireflysemantics.math.exception.ExceptionKeys.VALUE; import static com.fireflysemantics.math.exception.ExceptionTypes.NUMBER_TOO_SMALL; import static org.junit.Assert.assertEquals; import static org.junit.Assert.assertTrue; import org.junit.Test; import com.fireflysemantics.math.exception.ExceptionFactory; import com.fireflysemantics.math.exception.MathException; public class MathExceptionTest { @Test(expected = MathException.class) public void verifyThrows() { throw new MathException(NUMBER_TOO_SMALL); } @Test public void verifyCode() { try { throw new MathException(NUMBER_TOO_SMALL); } catch (MathException e) { assertEquals(e.getType(), NUMBER_TOO_SMALL); } } @Test public void verifyContext() { try { throw new MathException(NUMBER_TOO_SMALL).put(CONSTRAINT, 2).put(VALUE, 1); } catch (MathException e) { assertEquals(e.get(CONSTRAINT), 2); assertEquals(e.get(VALUE), 1); } } @Test public void verifyToString() { try { throw new MathException(NUMBER_TOO_SMALL).put(CONSTRAINT, 2).put(VALUE, 1); } catch (MathException e) { assertTrue(e.toString().contains(NUMBER_TOO_SMALL.toString())); assertTrue(e.toString().contains("1")); assertTrue(e.toString().contains("2")); assertTrue(e.toString().contains(CONSTRAINT)); assertTrue(e.toString().contains(VALUE)); } } @Test public void verifyFactory() { try { ExceptionFactory.throwNumberToSmallException(1, 2, "foo"); } catch (MathException e) { assertTrue(e.getType() == NUMBER_TOO_SMALL); assertEquals(e.get(CONSTRAINT), new Integer(2)); assertEquals(e.get(VALUE), new Integer(1)); assertEquals(e.get("foo"), new Integer(1)); } } } - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [math] Updated FieldMatrix exceptions thrown to match javadoc.
On 12/21/2015 01:45 PM, Phil Steitz wrote: On 12/21/15 12:26 PM, Ole Ersoy wrote: Should look like this, with some typos fixed: /** ... * @throws MatrixDimensionMismatchException if the dimensions of * {@code destination} do not match those of {@code this}. * @throws NumberIsTooSmallException if {@code endRow < startRow} or * {@code endColumn < startColumn}. * @throws OutOfRangeException if the indices are not valid. */ void copySubMatrix(int startRow, int endRow, int startColumn, int endColumn, T[][] destination) throws MatrixDimensionMismatchException, NumberIsTooSmallException, OutOfRangeException; I will fix this. I think it's easier to understand if the "too small" wording is included. Something like: @throws MatrixDimensionMismatchException if the {@code destination} matrix dimensions are too small. No, the MatrixDimensionMismatchException is thrown whenever the dimensions don't match exactly, as it says above. Maybe I'm missing something, but if that's the case then it seems like it would be better to just create and return the submatrix internally and avoid the exception altogether? The way I understand the method is that we have a matrix and we want the left-hand corner of the matrix to be replaced with the submatrix being copied into it? Cheers, Ole
Re: [math] Updated FieldMatrix exceptions thrown to match javadoc.
Should look like this, with some typos fixed: /** ... * @throws MatrixDimensionMismatchException if the dimensions of * {@code destination} do not match those of {@code this}. * @throws NumberIsTooSmallException if {@code endRow < startRow} or * {@code endColumn < startColumn}. * @throws OutOfRangeException if the indices are not valid. */ void copySubMatrix(int startRow, int endRow, int startColumn, int endColumn, T[][] destination) throws MatrixDimensionMismatchException, NumberIsTooSmallException, OutOfRangeException; I will fix this. I think it's easier to understand if the "too small" wording is included. Something like: @throws MatrixDimensionMismatchException if the {@code destination} matrix dimensions are too small. Cheers, Ole
Re: [math] Exception Design
On 12/21/2015 06:44 PM, Gilles wrote: On Mon, 21 Dec 2015 12:14:16 -0600, Ole Ersoy wrote: Hi, I was considering jumping into the JDKRandomGenerator exception discussion, but I did not want to hijack it. Not sure if any of you have had a chance to looks at this: https://github.com/firefly-math/firefly-math-exceptions/ https://github.com/firefly-math/firefly-math-exceptions/blob/master/src/main/java/com/fireflysemantics/math/exception/MathException.java I had a rapid look; unfortunately not in sufficient details to grasp the major departures from the existing framework. Could you display one or two examples comparing CM and firefly? In addition to what I summarized below one detail that I think is important is that the ExceptionTypes enum allows for more exact targeting of the exception root cause. For instance right now I have the following arithmetic exception types: /** * MATH ARITHMETIC EXCEPTIONS */ MAE("MAE"), MAE__INTEGER_OVERFLOW("MAE__INTEGER_OVERFLOW"), MAE__LONG_OVERFLOW("MAE__LONG_OVERFLOW"), MAE__OVERFLOW_IN_ADDITION("MAE__OVERFLOW_IN_ADDITION"), MAE__OVERFLOW_IN_SUBTRACTION("MAE__OVERFLOW_IN_SUBTRACTION"), MAE__GCD_OVERFLOW_32_BITS("MAE__GCD_OVERFLOW_32_BITS"), MAE__GCD_OVERFLOW_64_BITS("MAE__GCD_OVERFLOW_64_BITS"), MAE__LCM_OVERFLOW_32_BITS("MAE__LCM_OVERFLOW_32_BITS"), MAE__LCM_OVERFLOW_64_BITS("MAE__LCM_OVERFLOW_64_BITS"), MAE__DIVISION_BY_ZERO("MAE__DIVISION_BY_ZERO"), So by looking at the exception type we know exactly what the issue is. With this approach CM will always only have 1 exception. If more types are needed then just add another line to the ExceptionTypes Enum. The new type is used to look up the message template in the I18N resource bundle. It looks neat. Thanks :) But I did not see how localization is handled. I did leave localization out. 
I think localization was a hard requirement in earlier versions of CM, but I'm hoping that there is some flexibility on this and that future versions can defer to a utility that uses the ExceptionTypes Enum instance as the key to look up the internationalized template string. I think it satisfies everyone's requirements with: - A single MathException (No hierarchy) That would not satisfy everyone. :-! - The ExceptionTypes Enum contains all the exception types - The ExceptionTypes Enum 'key' maps to the corresponding message 1 to 1 - The ExceptionFactory (Utility) throws exceptions, if necessary, that have always have a single unique root cause, such as NPEs I was wondering whether the "factory" idea could indeed satisfy everyone. Rather than throwing the non-standard "MathException", the factory would generate one of the standard exceptions, constructed with the internal "MathException" as its cause: I think it's good that CM throws CM specific exceptions. This way when I write the handler I can know that the exception is CM specific without having to unwrap it. public class ExceptionFactory { public static void throwIllegalArgumentException(MathException e) { throw new IllegalArgumentException(e); } public static void throwNullPointerException(MathException e) { throw new NullPointerException(e); } // And so on... } So, CM code would be public class Algo { public void evaluate(double value) { // Check precondition. final double min = computeMin(); if (value < min) { final MathException e = new MathException(NUMBER_TOO_SMALL).put(CONSTRAINT, min).put(VALUE, value); ExceptionFactory.throwIllegalArgumentException(e); } // Precondition OK. } } Another thing that I hinted to is that the the factory builds in the precondition check in the throw method. So that the line: if (value < min) { can be nixed. Then, in an application's code: public void appMethod() { // ... // Use CM. 
try { Algo a = new Algo(); a.evaluate(2); } catch (IllegalArgumentException iae) { final Throwable cause = iae.getCause(); if (cause instanceof MathException) { final MathException e = (MathException) cause; // Rethrow an application-specific exception that will make more sense // to my customers. throw new InvalidDataInputException(e.get(CONSTRAINT), e.get(VALUE)); } } } This is all untested. Did I miss something? I think you got it all...But the handler will be shorter if the exception is not wrapped. The pattern I'm used to is that libraries wrap the exceptions of other libraries in order to offer a standardized facade to the user. For example Spring wraps Hibernate exceptions, since Spring is a layer on top of Hibernate and other data access providers. Ole Gilles - The context captures the exception parameters keyed by an the 'ExceptionKeys' enum. Each module can introduce more key
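The wrap-and-unwrap pattern sketched in this exchange (the factory throws a standard IllegalArgumentException whose cause is the CM-specific MathException, and the application unwraps it) can be shown end to end. Class and method names follow the thread; the exact shapes are assumptions, and this is a sketch rather than the proposed implementation.

```java
public class WrapSketch {

    /** Minimal stand-in for the single CM exception with its context. */
    static class MathException extends RuntimeException {
        final double constraint;
        final double value;

        MathException(double constraint, double value) {
            super("NUMBER_TOO_SMALL [CONSTRAINT=" + constraint + ", VALUE=" + value + "]");
            this.constraint = constraint;
            this.value = value;
        }
    }

    /** Factory wrapping the internal exception in a standard one. */
    static class ExceptionFactory {
        static void throwIllegalArgumentException(MathException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Library side: precondition check raising a wrapped CM exception. */
    static void evaluate(double value) {
        final double min = 1.0; // stand-in for computeMin()
        if (value < min) {
            ExceptionFactory.throwIllegalArgumentException(new MathException(min, value));
        }
    }

    public static void main(String[] args) {
        // Application side: catch the standard type, then unwrap the cause
        // to get at the typed parameters.
        try {
            evaluate(0.5);
        } catch (IllegalArgumentException iae) {
            Throwable cause = iae.getCause();
            if (cause instanceof MathException) {
                MathException e = (MathException) cause;
                System.out.println("constraint=" + e.constraint + " value=" + e.value);
            }
        }
    }
}
```

Ole's counterpoint in the thread is visible here: the handler is shorter if the library throws MathException directly and no unwrapping is needed.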
[math] Jitpack.io
In the process of making the firefly modules available for automatic install via Maven I came across this: https://jitpack.io/ Thought it might help ease publishing of test artifacts for math. Cheers, - Ole
Re: [math] Updated FieldMatrix exceptions thrown to match javadoc.
On 12/18/2015 04:07 PM, Phil Steitz wrote: On 12/18/15 2:59 PM, Ole Ersoy wrote: I think it makes sense. If the destination array is too small, throw an IAE. Right. That is what the implementations do - it is just a specialized IAE. We decided a while back not to throw "raw" IAE but to use things like MatrixDimensionMismatchException, which is what AbstractFieldMatrix does for the case described in the javadoc. Got it - thanks for the heads up. So in this case should: @throws MatrixDimensionMismatchException if the dimensions of {@code destination} do not match those of {@code this}. Be replaced with: @throws MatrixDimensionMismatchException if the destination array is too small. Which should replace the IAE? Ole Phil Perhaps the implementations need to be updated. I'm attempting to modularize the linear package ATM so I'll have a closer look. Cheers, - Ole On 12/18/2015 01:31 PM, Phil Steitz wrote: It does not look to me like any implementation we have of this interface actually throws raw IAE anywhere. I think maybe it is the javadoc that is wrong. On 12/18/15 4:47 AM, l...@apache.org wrote: Repository: commons-math Updated Branches: refs/heads/master abb205795 -> 5566a21d2 Updated FieldMatrix exceptions thrown to match javadoc. 
Github: closes #20 Project: http://git-wip-us.apache.org/repos/asf/commons-math/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-math/commit/5566a21d Tree: http://git-wip-us.apache.org/repos/asf/commons-math/tree/5566a21d Diff: http://git-wip-us.apache.org/repos/asf/commons-math/diff/5566a21d Branch: refs/heads/master Commit: 5566a21d2b34090d1ce8129f41b551a1187e7d5b Parents: abb2057 Author: Luc Maisonobe <l...@apache.org> Authored: Fri Dec 18 12:47:13 2015 +0100 Committer: Luc Maisonobe <l...@apache.org> Committed: Fri Dec 18 12:47:13 2015 +0100 -- src/main/java/org/apache/commons/math4/linear/FieldMatrix.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/commons-math/blob/5566a21d/src/main/java/org/apache/commons/math4/linear/FieldMatrix.java -- diff --git a/src/main/java/org/apache/commons/math4/linear/FieldMatrix.java b/src/main/java/org/apache/commons/math4/linear/FieldMatrix.java index 0db94b9..4c0ad9f 100644 --- a/src/main/java/org/apache/commons/math4/linear/FieldMatrix.java +++ b/src/main/java/org/apache/commons/math4/linear/FieldMatrix.java @@ -195,7 +195,7 @@ public interface FieldMatrix> extends AnyMatrix { void copySubMatrix(int startRow, int endRow, int startColumn, int endColumn, T[][] destination) throws MatrixDimensionMismatchException, NumberIsTooSmallException, -OutOfRangeException; +OutOfRangeException, IllegalArgumentException; /** * Copy a submatrix. Rows and columns are indicated .
[math] ArithmeticUtils subAndCheck try catch
(This is nit-picky.) ArithmeticUtils subAndCheck uses a message template that is meant for addition. Should it catch and rethrow the exception with a subtraction template? This is how the exception is thrown (Line 470): https://github.com/apache/commons-math/blob/master/src/main/java/org/apache/commons/math4/util/ArithmeticUtils.java throw new MathArithmeticException(LocalizedFormats.OVERFLOW_IN_ADDITION, a, -b); Cheers, Ole
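A minimal sketch of the fix being suggested: perform the subtraction's overflow check directly and throw with a subtraction-specific message, instead of delegating to the addition path and inheriting its OVERFLOW_IN_ADDITION template. ArithmeticException stands in here for MathArithmeticException, and the message text is illustrative:

```java
public class SubAndCheck {
    /** Subtract two ints, failing with a subtraction-specific message on overflow. */
    public static int subAndCheck(int x, int y) {
        long s = (long) x - (long) y; // exact in long, so overflow is detectable
        if (s < Integer.MIN_VALUE || s > Integer.MAX_VALUE) {
            // Subtraction template, not the addition one the post flags.
            throw new ArithmeticException("overflow in subtraction: " + x + " - " + y);
        }
        return (int) s;
    }
}
```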
Re: [math] Updated FieldMatrix exceptions thrown to match javadoc.
I think it makes sense. If the destination array is too small, throw an IAE. Perhaps the implementations need to be updated. I'm attempting to modularize the linear package ATM so I'll have a closer look. Cheers, - Ole On 12/18/2015 01:31 PM, Phil Steitz wrote: It does not look to me like any implementation we have of this interface actually throws raw IAE anywhere. I think maybe it is the javadoc that is wrong. On 12/18/15 4:47 AM, l...@apache.org wrote: Repository: commons-math Updated Branches: refs/heads/master abb205795 -> 5566a21d2 Updated FieldMatrix exceptions thrown to match javadoc. Github: closes #20 Project: http://git-wip-us.apache.org/repos/asf/commons-math/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-math/commit/5566a21d Tree: http://git-wip-us.apache.org/repos/asf/commons-math/tree/5566a21d Diff: http://git-wip-us.apache.org/repos/asf/commons-math/diff/5566a21d Branch: refs/heads/master Commit: 5566a21d2b34090d1ce8129f41b551a1187e7d5b Parents: abb2057 Author: Luc MaisonobeAuthored: Fri Dec 18 12:47:13 2015 +0100 Committer: Luc Maisonobe Committed: Fri Dec 18 12:47:13 2015 +0100 -- src/main/java/org/apache/commons/math4/linear/FieldMatrix.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/commons-math/blob/5566a21d/src/main/java/org/apache/commons/math4/linear/FieldMatrix.java -- diff --git a/src/main/java/org/apache/commons/math4/linear/FieldMatrix.java b/src/main/java/org/apache/commons/math4/linear/FieldMatrix.java index 0db94b9..4c0ad9f 100644 --- a/src/main/java/org/apache/commons/math4/linear/FieldMatrix.java +++ b/src/main/java/org/apache/commons/math4/linear/FieldMatrix.java @@ -195,7 +195,7 @@ public interface FieldMatrix> extends AnyMatrix { void copySubMatrix(int startRow, int endRow, int startColumn, int endColumn, T[][] destination) throws MatrixDimensionMismatchException, NumberIsTooSmallException, -OutOfRangeException; +OutOfRangeException, IllegalArgumentException; /** * 
Copy a submatrix. Rows and columns are indicated . - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
[math] Arithmetic module
Hi, I just published an arithmetic module (ArithmeticUtils repackaged). I would love to get some feedback on the exception design: any improvements, whether you think it could work for CM in general, etc. The exception design strips localization, but it should be very easy to retrofit this within CM-dependent applications due to the way each ExceptionTypes enum is coded. https://github.com/firefly-math/firefly-math-arithmetic https://github.com/firefly-math/firefly-math-arithmetic/blob/master/src/main/java/com/fireflysemantics/math/arithmetic/Arithmetic.java Depends on: https://github.com/firefly-math/firefly-math-exceptions One thing that is non-standard is that I'm documenting the exception type like this: * @throws MathException[NOT_POSITIVE] * if {@code y < 0}. * @throws MathException[MAE__LONG_OVERFLOW] * if the result would overflow. Also, one of the tests requires the RandomDataGenerator. https://github.com/firefly-math/firefly-math-arithmetic/issues/1 I have not taken the time to understand this test yet, but would love it if it's possible to just generate and test with static data instead, assuming that's a valid approach? Cheers, - Ole
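Based on the description above, the design appears to be a single unchecked MathException carrying an ExceptionTypes enum constant instead of a localized message hierarchy. A guessed sketch of that shape (the names NOT_POSITIVE and MAE__LONG_OVERFLOW come from the post; the fields and the pow guard are illustrative, not the firefly-math code):

```java
enum ExceptionTypes { NOT_POSITIVE, MAE__LONG_OVERFLOW }

class MathException extends RuntimeException {
    private final ExceptionTypes type;

    MathException(ExceptionTypes type, String detail) {
        super(type + ": " + detail);
        this.type = type;
    }

    /** Callers (or a retrofitted localization layer) switch on the type instead of parsing text. */
    ExceptionTypes getType() { return type; }
}

public class Arithmetic {
    /** Example guard documented as "@throws MathException[NOT_POSITIVE] if y < 0". */
    public static long pow(long k, int y) {
        if (y < 0) {
            throw new MathException(ExceptionTypes.NOT_POSITIVE, "y = " + y);
        }
        long result = 1;
        for (int i = 0; i < y; i++) {
            result *= k; // overflow checking omitted in this sketch
        }
        return result;
    }
}
```

Carrying the enum rather than a formatted string is what makes later localization a drop-in: the type is machine-readable, and only the message rendering has to change.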
Re: [math] AbstractFieldMatrix.checkMultiplicationCompatible() throws exception?
Actually I think I see why bubbling exceptions is better than performing boolean checks...so just ignore, unless there is some merit to it... Cheers, Ole On 12/16/2015 01:08 PM, Ole Ersoy wrote: Hi, I'm working on making the linear package standalone. The methods that perform precondition checks for matrix operations throw exceptions (See below). An option would be return a boolean instead. Obviously I would love it if CM adopts the code at some point, so I want to check whether changing the interface is going to kill kittens. Cheers, - Ole CURRENT /** * Check if a matrix is multiplication compatible with the instance. * * @param m *Matrix to check. * @throws DimensionMismatchException * if the matrix is not multiplication-compatible with instance. */ protected void checkMultiplicationCompatible(final FieldMatrix m) throws DimensionMismatchException { if (getColumnDimension() != m.getRowDimension()) { throw new DimensionMismatchException(m.getRowDimension(), getColumnDimension()); } } PROPOSED /** * Check if a matrix is multiplication compatible with the instance. * * @param m *Matrix to check. * @return true if the matrix is multiplication compatible, false otherwise. */ protected boolean checkMultiplicationCompatible(final FieldMatrix m) { if (getColumnDimension() != m.getRowDimension()) { return false; } return true; } - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
[math] AbstractFieldMatrix.checkMultiplicationCompatible() throws exception?
Hi, I'm working on making the linear package standalone. The methods that perform precondition checks for matrix operations throw exceptions (See below). An option would be return a boolean instead. Obviously I would love it if CM adopts the code at some point, so I want to check whether changing the interface is going to kill kittens. Cheers, - Ole CURRENT /** * Check if a matrix is multiplication compatible with the instance. * * @param m *Matrix to check. * @throws DimensionMismatchException * if the matrix is not multiplication-compatible with instance. */ protected void checkMultiplicationCompatible(final FieldMatrix m) throws DimensionMismatchException { if (getColumnDimension() != m.getRowDimension()) { throw new DimensionMismatchException(m.getRowDimension(), getColumnDimension()); } } PROPOSED /** * Check if a matrix is multiplication compatible with the instance. * * @param m *Matrix to check. * @return true if the matrix is multiplication compatible, false otherwise. */ protected boolean checkMultiplicationCompatible(final FieldMatrix m) { if (getColumnDimension() != m.getRowDimension()) { return false; } return true; } - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
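For the record, the tradeoff Ole settles on in the follow-up shows up at the call site: the throwing variant keeps each caller to one line and bundles the offending dimensions into the error, while the boolean variant forces every caller to rebuild that context itself. A sketch with plain ints and IllegalArgumentException standing in for the matrix types and DimensionMismatchException:

```java
public class CompatCheck {
    /** CURRENT-style: the failure context travels with the exception. */
    public static void checkMultiplicationCompatible(int thisCols, int otherRows) {
        if (thisCols != otherRows) {
            throw new IllegalArgumentException(otherRows + " != " + thisCols);
        }
    }

    /** PROPOSED-style: every caller must reconstruct the error itself. */
    public static boolean isMultiplicationCompatible(int thisCols, int otherRows) {
        return thisCols == otherRows;
    }

    public static void multiply(int aCols, int bRows) {
        // One line with the throwing variant:
        checkMultiplicationCompatible(aCols, bRows);

        // Versus the boolean variant, repeated at every call site:
        // if (!isMultiplicationCompatible(aCols, bRows)) {
        //     throw new IllegalArgumentException(bRows + " != " + aCols);
        // }
    }
}
```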
Re: [math] Refactored Precision
Hi Thomas, On 12/14/2015 06:37 AM, Thomas Neidhart wrote: On Mon, Dec 14, 2015 at 9:17 AM, Ole Ersoy <ole.er...@gmail.com> wrote: Hi, Just a heads up for those of you interested or have nothing better to do at 2 am :). I refactored the Precision class into classes PrecisionAssert and RoundDouble. https://github.com/firefly-numbers/firefly-numbers I created a new github organization for the package, since it deals less with math and more with number precision in general. I also removed support for float. It seems like most of the code in CM uses double, and if float is needed then it should be provided via its own module. I also replaced calls to FastMath with Math. Most of the calls were for abs() and ulp()...functions that I would think would have similar performance regardless. Probably moving onto FastMath next. I plan on only including functions that have a performance benefit, and delegating to Math for everything else. Hi Ole, what is the motivation for posting these questions on the math ML? Do you intend to contribute some new functionality or propose changes back to commons-math? Or is this project intended to be a fork of commons-math? Thomas I sent a few emails earlier regarding the precision code with questions that were just questions... We have been discussing refactoring CM, so I've started what can be thought of as a Java 8 (leaning towards Java 9) usable prototype of such a refactoring. As I'm going through the process and reviewing the code, I ask questions (Hopefully good ones) when I find something that I think could be simplified, etc. 
So far there are probably 6 big changes that might be useful for CM: 1) Lombok to reduce boilerplate code (Can be seen in the new exception module - Also generates javadoc) 2) Removal of float precision utilities in Precision 3) Java 8 and its constructs 4) Coming (Observer design for Optimizers) 5) The dependency structure of the modules 6) Hopefully increased generic use for modules like the numbers module Fork usually has a negative connotation. When architects draw up designs there are usually several that can be used for comparison and contrast. The primary purpose of sharing the results of the refactoring is that. I should probably make it clear that I am 100% for CM. I think the developers and contributors are amazing and I have tremendous respect for all of you. Cheers, - Ole
[math] Refactored Precision
Hi, Just a heads up for those of you interested or have nothing better to do at 2 am :). I refactored the Precision class into classes PrecisionAssert and RoundDouble. https://github.com/firefly-numbers/firefly-numbers I created a new github organization for the package, since it deals less with math and more with number precision in general. I also removed support for float. It seems like most of the code in CM uses double, and if float is needed then it should be provided via its own module. I also replaced calls to FastMath with Math. Most of the calls were for abs() and ulp()...functions that I would think would have similar performance regardless. Probably moving onto FastMath next. I plan on only including functions that have a performance benefit, and delegating to Math for everything else. Cheers, - Ole
[math] Another Question about Precision.round()
Hi, In Precision there are these two methods: [1] public static double round(double x, int scale, int roundingMethod) [2] public static float round(float x, int scale, int roundingMethod) The implementations for each are different. For [2] could users just convert float to double and use [1] instead? Cheers, Ole
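One relevant data point for the question above: the float-to-double widening conversion is exact, so delegating [2] to [1] would round the float's true binary value. That value is often not the decimal literal the user typed, which is presumably where the two implementations could disagree at scales near float precision. A small demonstration:

```java
public class FloatWidening {
    public static void main(String[] args) {
        // Widening is exact: the double exposes the float's true binary value.
        double widened = (double) 0.1f;
        System.out.println(widened);        // 0.10000000149011612
        System.out.println(widened == 0.1); // false: 0.1f and 0.1d approximate 0.1 differently
    }
}
```

So the double path is well-defined for widened floats; the open question is whether its results match what the float-specific implementation currently returns.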
[Math] Precision.roundUnscaled BIG_DECIMAL
Hi, I'm creating a new utilities module and I'm trying to decide whether to keep the code block below, contained in the Precision.roundUnscaled() method. (BTW, it contains a new exception type I'm playing with - see https://github.com/firefly-math/firefly-math-exceptions if interested. The utilities module will be published soon, putting the new exception code to validated use... hopefully.) Anyways, here's the block: case BigDecimal.ROUND_UNNECESSARY: if (unscaled != FastMath.floor(unscaled)) { throw new MathException(ExceptionTypes.MATH_ARITHMETIC_EXCEPTION); } break; So I would pass in the unscaled argument and say that rounding is not necessary. Then if the exception is thrown, I know that presumably I should have chosen one of the other BigDecimal rounding operations...? To me it seems simpler to check for the required operation like this before proceeding: if (unscaled != FastMath.floor(unscaled)) { // If necessary throw an application specific exception } Unless I'm missing something big, I'll remove the above code block, as well as the MATH_ARITHMETIC_EXCEPTION, in the firefly-math-utilities module. WDYT? Cheers, - Ole
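The block being questioned mirrors BigDecimal's own ROUND_UNNECESSARY contract: the mode is an assertion that no rounding is needed, and BigDecimal throws ArithmeticException the moment a digit would be discarded. That standard-library behavior can be seen directly:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundUnnecessaryDemo {
    /** Rescale to one decimal place under UNNECESSARY; report the exception if it fires. */
    public static String setScaleUnnecessary(String value) {
        try {
            return new BigDecimal(value).setScale(1, RoundingMode.UNNECESSARY).toString();
        } catch (ArithmeticException e) {
            return "ArithmeticException";
        }
    }

    public static void main(String[] args) {
        // Dropping a trailing zero loses no digit, so UNNECESSARY succeeds.
        System.out.println(setScaleUnnecessary("1.20")); // 1.2
        // A nonzero digit would be discarded, so the assertion fails and it throws.
        System.out.println(setScaleUnnecessary("1.25")); // ArithmeticException
    }
}
```

Which supports the post's conclusion: a caller who can assert this precondition can just check it up front, exactly as the proposed floor() comparison does.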
Re: [math] additions to MathArrays
Notes inline... On 11/24/2015 08:28 AM, Gilles wrote: On Tue, 24 Nov 2015 06:52:04 -0700, Phil Steitz wrote: I need the following methods to complete the fix for MATH-1246. I can add them as private methods to the KS class; but they seem generally useful, so I propose adding them to MathArrays. Any objections? /** * Concatenates two arrays. * * @param x first array * @param y second array * @return a new array consisting of the entries of x followed by the entries of y * @throws NullPointerException if either x or y is null */ public static double[] concat(double[] x, double[] y) I'd propose public static double[] cat(double[] ... arrays) Java 8 has an addAll method that is similar to the cat method: http://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html#addAll-int-java.util.Collection- Perhaps use a similar naming convention... The cat API also does what JavaScript's splice method does, so that might be a better name. http://www.w3schools.com/jsref/jsref_splice.asp /** * Returns an array consisting of the unique values in {@code data}. * The return array is sorted in descending order. * * @param data * @return descending list of values included in the input array */ public static double[] values(double[] data) I'd suggest public static double[] uniq(double[] data) Just a note - Java 8 Streams make the implementation fairly short: Integer[] array = new Integer[] {5, 10, 20, 58, 10}; Stream.of(array).distinct().forEach(i -> System.out.print(" " + i)); // prints: 5 10 20 58 Cheers, - Ole /** * Adds random jitter to {@code data} using deviates sampled from {@code dist}. * * Note that jitter is applied in-place - i.e., the array * values are overwritten with the result of applying jitter. 
* * @param data input/output data array * @param dist probability distribution to sample for jitter values * @throws NullPointerException if either of the parameters is null */ public static void jitter(double[] data, RealDistribution dist) IMO, this method should be part of the new API proposed in https://issues.apache.org/jira/browse/MATH-1158 Like so: /** {@inheritDoc} */ public RealDistribution.Sampler createSampler(final RandomGenerator rng) { return new RealDistribution.Sampler() { public double next() { /* ... */ } public double[] next(int sampleSize) { /* ... */ } /** * @param data input/output data array. * @param dist probability distribution to sample for jitter values * @throws NullPointerException if data array is null. */ public void jitter(double[] data) { final int len = data.length; final double[] jit = next(len); for (int i = 0; i < len; i++) { data[i] += jit[i]; } } }; } Advantages: * Simpler to use (half as many arguments). :-) * Utility is defined where the base functionality is defined. * Avoid adding to the overcrowded MathArrays utility class. * Avoid dependency to another package (will help with modularization). Drawbacks from usage POV * None (?) Drawbacks from implementation POV * Cannot be done before we agree to go forward with the aforementioned issue. In the meantime, I suggest to implement it as a "private" method. Regards, Gilles - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
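Both utilities discussed in the thread above, the proposed varargs cat and the values()/uniq() contract ("unique values, descending"), are indeed short with Java 8 streams. A sketch under those proposed signatures, not the actual MathArrays code:

```java
import java.util.Arrays;

public class ArrayOps {
    /** Varargs concatenation, per the proposed cat(double[] ... arrays) signature. */
    public static double[] cat(double[]... arrays) {
        return Arrays.stream(arrays).flatMapToDouble(Arrays::stream).toArray();
    }

    /** Unique values in descending order, matching the proposed values()/uniq() javadoc. */
    public static double[] uniq(double[] data) {
        // DoubleStream has no sorted(Comparator), so negate, sort ascending, negate back.
        return Arrays.stream(data).distinct().map(d -> -d).sorted().map(d -> -d).toArray();
    }
}
```

(A production version would need a policy for NaN, which the negation trick above does not address.)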
Re: [math] Version mgt idea
On 11/13/2015 08:12 AM, Gilles wrote: On Mon, 9 Nov 2015 10:34:43 -0600, Ole Ersoy wrote: If I'm interested in some functionality that is 'beta' then I first have to realize that it's 'beta'...Maybe just tag the branch beta. After that there's probably (Judging from the number of people communicating here) 1/2 people interested. Isn't it easier for them to just check out the beta branch and manage the process internally using their own internal 'beta' naming convention until they are happy with the quality? [IIUC] With this approach, I think that we won't get feedback from would-be testers who don't want to have to "manually" download and compile. You could be right. Personally, I'm fine doing: `git clone ...` `mvn install` `get to work..` I'm assuming the critical feedback is being targeted at the more complex components that are hard to find in the wild, so testers have a natural impetus to jump through a few hoops. We need to _release_ the beta versions. Also, for CM developers, what you describe does not seem to add anything: they already can create branches and experiment locally. True. I'm assuming that someone needing a feature will first google, find out that what they are looking for is available - and is potentially better in the beta version - so they will be ok with building and experimenting with the branch locally. From here it looks like the core problem is that there are too many modules intermingled in CM causing friction / weighing down the release process for each of the individual modules (Which I'm currently imagining). Could be... :-} The complexity of multiple interdependent features being refactored simultaneously might also turn some off. If multiple beta branches are created isolating each change it could make it easier to request and process feedback. 
Cheers, - Ole Regards, Gilles Cheers, - Ole On 11/06/2015 06:51 PM, Gary Gregory wrote: On Fri, Nov 6, 2015 at 4:02 PM, Phil Steitz <phil.ste...@gmail.com> wrote: On 11/6/15 4:46 PM, Gary Gregory wrote: On Fri, Nov 6, 2015 at 3:01 PM, Phil Steitz <phil.ste...@gmail.com> wrote: On 11/6/15 2:51 PM, Gary Gregory wrote: On Fri, 6 Nov 2015 09:17:18 -0700, Phil Steitz wrote: Here is an idea that might break our deadlock re backward compatibility, versioning and RERO: Agree that odd numbered versions have stable APIs - basically adhere to Commons rules - no breaks within 3.0, 3.1, ..., 3.x... or 5.0, 5.1... but even-numbered lines can include breaks - ... This sounds awfully complicated for my puny human brain. How, exactly? Seems pretty simple to me. The even-numbered release lines may have compat breaks; but the odd-numbered do not. It's bad enough that I have to remember how each FOSS project treats versions numbers, but having an exception within a Commons component is even worse. This is a non-starter for me. Do you have any better suggestions? The problem we are trying to solve is we can't RERO while sticking to the normal compat rules without turning major versions all the time, which forces users to repackage all the time and us to support more versions concurrently than we have bandwidth to do. I do not see how a different version scheme will determine how many branches the community supports. If we just keep one 4.x branch that keeps cutting (possibly incompatible) releases, that is just one line, one branch. If we have to cut 4.1, 4.2, 4.3 as 4, 5, 6 instead and we don't allow any compat breaks, we end up having to maintain and release 4.0.1, 5.0.1, 6.0.1 instead of just 4.3.1, for example, or we just strand the 4, 5 users in terms of bug fixes as we move on to 6. Breaking BC without a package and coord change is a no-go. We have done this before and we will probably do it again - and more if we have to don't separate out an unstable line. 
You have to think about this jar as a dependency that can be deeply nested in a software stack. Commons components are such creatures. I unfortunately run into this more than I'd like: Big FOSS project A depends on B which depends on C. Then I want to integrate with Project X which depends on Y which depends on different versions of B and C. Welcome to jar hell if B and C are not compatible. If B and C follow the rule of break-BC -> new package/coords, then all is well. The mitigation here is that we would not expect the even-numbered releases to be deployed widely. Respectfully Phil, my point is that while this might be true, it is in practice irrelevant. We cannot control the spread of our jars and their usage, hence the importance of BC. Gary Phil Gary Phil Gary - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.
Re: [math] Smaller Packages / Artifacts / Dependencies
On 11/07/2015 04:00 AM, Gilles wrote: On Fri, 6 Nov 2015 15:06:35 -0600, Ole Ersoy wrote: If math is broken up into smaller artifacts it will make it easier for users to upgrade, even if it breaks compatibility, as well as speed up the release frequency. So for example: commons-math-optimization (Or even more granular commons-math-optimization-lp, commons-math-optimization-ga, commons-math-optimization-nlp, etc) commons-math-simulation commons-math-statistics commons-math-ai (Neural Networks, ...) etc. I also believe that modularity is a worthy goal. A first step would be to collect some statistics on inter-package dependencies. Personally I like modules and repositories that are very small and focused with as few dependencies as possible. I'm probably in the extreme bullseye of the circle on this. The reason I like it is because I can read a few usage lines in the github README.md and go. It's easy to contribute to and minimizes indirection. For example I think the optimizers are complex enough to warrant their own module. The distributions probably belong in a single module, etc. I'm still in the process of getting a demo repository setup, but it will be along these lines. Once that's done it should make it really simple for someone to just clone, build, and get to work. It's nice if it's on Maven, but if the module is tiny, and easy to verify visually, then cloning and building is a reasonable way to get things done. There will certainly be a "commons-math-core" containing packages like "o.a.c.m.util" and "o.a.c.m.exception". [At some point, releasing separate JARs could also provide us with indirect feedback on which parts of CM are actually used.] And the stars on github are a pretty good indicator as well. Cheers, - Ole
Re: [math] Version mgt idea
If I'm interested in some functionality that is 'beta' then I first have to realize that it's 'beta'...Maybe just tag the branch beta. After that there's probably (Judging from the number of people communicating here) 1/2 people interested. Isn't it easier for them to just check out the beta branch and manage the process internally using their own internal 'beta' naming convention until they are happy with the quality? From here it looks like the core problem is that there are too many modules intermingled in CM causing friction / weighing down the release process for each of the individual modules (Which I'm currently imagining). Cheers, - Ole On 11/06/2015 06:51 PM, Gary Gregory wrote: On Fri, Nov 6, 2015 at 4:02 PM, Phil Steitz wrote: On 11/6/15 4:46 PM, Gary Gregory wrote: On Fri, Nov 6, 2015 at 3:01 PM, Phil Steitz wrote: On 11/6/15 2:51 PM, Gary Gregory wrote: On Fri, 6 Nov 2015 09:17:18 -0700, Phil Steitz wrote: Here is an idea that might break our deadlock re backward compatibility, versioning and RERO: Agree that odd numbered versions have stable APIs - basically adhere to Commons rules - no breaks within 3.0, 3.1, ..., 3.x... or 5.0, 5.1... but even-numbered lines can include breaks - ... This sounds awfully complicated for my puny human brain. How, exactly? Seems pretty simple to me. The even-numbered release lines may have compat breaks; but the odd-numbered do not. It's bad enough that I have to remember how each FOSS project treats version numbers, but having an exception within a Commons component is even worse. This is a non-starter for me. Do you have any better suggestions? The problem we are trying to solve is we can't RERO while sticking to the normal compat rules without turning major versions all the time, which forces users to repackage all the time and us to support more versions concurrently than we have bandwidth to do. I do not see how a different version scheme will determine how many branches the community supports. 
If we just keep one 4.x branch that keeps cutting (possibly incompatible) releases, that is just one line, one branch. If we have to cut 4.1, 4.2, 4.3 as 4, 5, 6 instead and we don't allow any compat breaks, we end up having to maintain and release 4.0.1, 5.0.1, 6.0.1 instead of just 4.3.1, for example, or we just strand the 4, 5 users in terms of bug fixes as we move on to 6. Breaking BC without a package and coord change is a no-go. We have done this before and we will probably do it again - and more if we don't separate out an unstable line. You have to think about this jar as a dependency that can be deeply nested in a software stack. Commons components are such creatures. I unfortunately run into this more than I'd like: Big FOSS project A depends on B which depends on C. Then I want to integrate with Project X which depends on Y which depends on different versions of B and C. Welcome to jar hell if B and C are not compatible. If B and C follow the rule of break-BC -> new package/coords, then all is well. The mitigation here is that we would not expect the even-numbered releases to be deployed widely. Respectfully Phil, my point is that while this might be true, it is in practice irrelevant. We cannot control the spread of our jars and their usage, hence the importance of BC. Gary Phil Gary Phil Gary
[math] Smaller Packages / Artifacts / Dependencies
If math is broken up into smaller artifacts it will make it easier for users to upgrade, even if it breaks compatibility, as well as speed up the release frequency. So for example: commons-math-optimization (Or even more granular commons-math-optimization-lp, commons-math-optimization-ga, commons-math-optimization-nlp, etc) commons-math-simulation commons-math-statistics commons-math-ai (Neural Networks, ...) etc. Cheers, - Ole
Re: Proposed Contribution to Apache Commons,
Hi, On 09/30/2015 01:44 AM, Jochen Wiedmann wrote: Hi, Norman, On Tue, Sep 29, 2015 at 10:06 PM,wrote: My colleague Jeff Rothenberg and I, retirees, have developed an alternative to using regular expressions for searching for (and optionally replacing) patterns in text. Something that is more fluid and can communicate the semantics of the operation as it is written would be nice. I admit that I am becoming somewhat cautious by reading about "an alternative" Agree - most regular expression engines seem to be battled tested fairly well. I think the complex part of regular expressions is building it and interpreting it, so if something helps with that, thumbs up! to one of the best researched and understood parts of the theory of computer science, namely regular expressions. We know exactly, how powerful regexes are. And we also know exactly, what they can't do. Regular expressions remind me of punch cards. If you know what you are doing, and have invested the time to prove that what the expression is doing is what it's supposed to be doing, it's great. However reading, rationalizing, and testing: |^(?:(?:(?:0?[13578]|1[02])(\/|-|\.)31)\1|(?:(?:0?[13-9]|1[0-2])(\/|-|\.)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:0?2(\/|-|\.)29\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|\.)(?:0?[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$ Can be tricky. So if there's a more fluid way to build and communicate what the above sequence is doing, that would be awesome. | Regarding your work, we don't even know how it would look like. Perhaps if some simple examples were posted on the mailing list the developers here could get a better feel for it. Because, frankly, docs are quite sparse. And, what I found (SimpleExamples.java) looks to me more like a different API to creating regexes, rather than an alternative. And, why bother about that? I never felt it to be painful to specify a regex in text. 
Honestly I feel the pain here. I have been close to slipping into a coma several times while analysing a regular expression. We all know programmers make mistakes (not me :) ), even in simple situations occasionally, so given how complex some regexes can get... Cheers, - Ole Did so as a Perl programmer, as an Apache programmer (initially with Oro, and the like) and nowadays as a Java programmer. Sorry, but unless you come up with compelling reasons, I am -0 to -1. Jochen
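One mitigation that already exists in java.util.regex, short of adopting a new library, is Pattern.COMMENTS (inline flag (?x)): whitespace in the pattern is ignored and '#' starts a comment, so each fragment can be documented where it appears. A small illustrative date-shaped matcher (deliberately far weaker than the monster quoted above):

```java
import java.util.regex.Pattern;

public class ReadableRegex {
    // Pattern.COMMENTS / (?x): whitespace in the pattern is ignored, '#' comments to end of line.
    static final Pattern SIMPLE_DATE = Pattern.compile(
          "(?x)                     # free-spacing mode \n"
        + "\\d{4}                   # year              \n"
        + "- (0[1-9]|1[0-2])        # month 01-12       \n"
        + "- (0[1-9]|[12]\\d|3[01]) # day 01-31         \n");

    public static boolean isDate(String s) {
        return SIMPLE_DATE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDate("2015-09-30")); // true
        System.out.println(isDate("2015-13-01")); // false: month 13 matches neither alternative
    }
}
```

The engine and semantics are unchanged; only the authoring and review experience improves, which seems to be the pain point both posts agree on.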
Re: [Math] Utilization of Lombok
On 09/27/2015 11:14 PM, venkatesha murthy wrote: Do we know if Lombok is supported on all flavours of Java, for instance the IBM JDK, OpenJDK, Java 8, etc.? I was just thinking about future-proof readiness. I am absolutely interested in Lombok and even today use it for most demo purposes, and I have been pushing for its use where the Oracle JDK is the primary platform. However, I have always been cautioned/warned in the past against using it in production systems because Lombok is not supported on all Java platforms. More info here: http://stackoverflow.com/questions/6107197/how-does-lombok-work I would still future-proof by wrapping the solution in a Docker container when possible. Cheers, - Ole
Re: [Math] Utilization of Lombok
Hi Thomas, On 09/25/2015 08:45 AM, Thomas Neidhart wrote: Hi Ole, can you explain why you think that the addition of lombok brings any benefit to our users? Sure - I'm looking at the LevenbergMarquardtOptimizer ATM, and it has the following set of parameters: /* configuration parameters */ /** Positive input variable used in determining the initial step bound. */ private final double initialStepBoundFactor; /** Desired relative error in the sum of squares. */ private final double costRelativeTolerance; /** Desired relative error in the approximate solution parameters. */ private final double parRelativeTolerance; /** Desired max cosine on the orthogonality between the function vector * and the columns of the jacobian. */ private final double orthoTolerance; /** Threshold for QR ranking. */ private final double qrRankingThreshold; And corresponding getters: /** * Gets the value of a tuning parameter. * @see #withInitialStepBoundFactor(double) * * @return the parameter's value. */ public double getInitialStepBoundFactor() { return initialStepBoundFactor; } /** * Gets the value of a tuning parameter. * @see #withCostRelativeTolerance(double) * * @return the parameter's value. */ public double getCostRelativeTolerance() { return costRelativeTolerance; } /** * Gets the value of a tuning parameter. * @see #withParameterRelativeTolerance(double) * * @return the parameter's value. */ public double getParameterRelativeTolerance() { return parRelativeTolerance; } /** * Gets the value of a tuning parameter. * @see #withOrthoTolerance(double) * * @return the parameter's value. */ public double getOrthoTolerance() { return orthoTolerance; } /** * Gets the value of a tuning parameter. * @see #withRankingThreshold(double) * * @return the parameter's value. */ public double getRankingThreshold() { return qrRankingThreshold; } Lombok will generate all of these. 
Eclipse can do the same thing, but if we delete one of the parameters, then the corresponding getter also has to be deleted. Also Lombok cuts down on the source code noise, since it is a byte code generator. The generated code does not appear in the source. Lombok also has a @Builder annotation that can be used to generate an inner static builder class that provides a fluent construction API. So if we break off the LevenbergMarquardtOptimizer configuration into its own class, and generate all the getters and the fluent API, there should be substantial code reduction. Gilles is also working on a snapshot capability for neural nets, and the @Synchronized annotation could come in handy here. These are the items I have looked at so far. From my point of view, lombok can help developers by taking over some tedious tasks, but this is quite irrelevant in the case of CM as the majority of work goes into algorithm design and verification rather than in writing getters/setters (which btw has pretty good IDE support). I agree that the majority of time goes into the design of the algorithm. For me personally, when I'm looking at code, and it has a ton of boilerplate, it does slow my productivity...just because of all the noise. I'm happy once I've gotten it as DRY as possible. The more boilerplate, the more reluctant we are going to be to undertake refactoring, and we will make more mistakes (At least I will :) ). So this would just add additional complexity and the gain is very unclear. I think you will find it refreshing once you try it. At this point though I just wanted to float the idea. I'll complete the experiment and publish the result. At that point we will have a good baseline to gauge whether adding it will add enough value to offset the cost of adding it. Cheers, - Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
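For comparison, here is a hand-written sketch of roughly what Lombok's @Value/@Builder pair would generate if the tuning parameters above were split off into their own configuration class. The class name `LmConfig` is hypothetical, only two of the five parameters are shown, and the defaults are illustrative; with Lombok, all of this would collapse to annotated field declarations.

```java
// Hand-written sketch of roughly what Lombok's @Value/@Builder pair
// would generate for a split-off optimizer configuration class.
// LmConfig is a hypothetical name; defaults are illustrative only.
final class LmConfig {

    private final double initialStepBoundFactor;
    private final double costRelativeTolerance;

    private LmConfig(double initialStepBoundFactor, double costRelativeTolerance) {
        this.initialStepBoundFactor = initialStepBoundFactor;
        this.costRelativeTolerance = costRelativeTolerance;
    }

    double getInitialStepBoundFactor() { return initialStepBoundFactor; }
    double getCostRelativeTolerance()  { return costRelativeTolerance; }

    static Builder builder() { return new Builder(); }

    // The fluent inner static builder that @Builder would generate.
    static final class Builder {
        private double initialStepBoundFactor = 100;    // illustrative default
        private double costRelativeTolerance  = 1e-10;  // illustrative default

        Builder initialStepBoundFactor(double v) { initialStepBoundFactor = v; return this; }
        Builder costRelativeTolerance(double v)  { costRelativeTolerance = v; return this; }

        LmConfig build() { return new LmConfig(initialStepBoundFactor, costRelativeTolerance); }
    }

    public static void main(String[] args) {
        // Fluent construction: override one parameter, keep the other default.
        LmConfig cfg = LmConfig.builder().initialStepBoundFactor(10.0).build();
        System.out.println(cfg.getInitialStepBoundFactor() + " " + cfg.getCostRelativeTolerance());
    }
}
```

The point of the experiment is that deleting a field from the annotated class would delete its getter and builder method at the same time, which is exactly the coupling that hand-maintained boilerplate lacks.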
Re: [Math] Utilization of SLF4J?
Hi Thomas, On 09/25/2015 08:54 AM, Thomas Neidhart wrote: Hi Ole, for a start, I think you are asking the wrong question. First of all we need to agree that we want to add some kind of logging facility to CM. Well it has to be SLF4J because that's the one I'm most familiar with :). We did discuss having observers that can listen in on increment events that algorithms publish. This would provide a dependency-free method for doing so with one drawback: now everyone that wants the algorithm to log has to implement logging. If the outcome is positive, there are a handful of alternatives, some of them more viable than slf4j in the context of CM (e.g. JUL or commons-logging). Would you be upset if it was SLF4J? This is minor, but I like the @Slf4j annotation that Lombok provides. By the way, the same discussion has taken place for other commons components as well, and the result usually was: do not add logging. I think for the reason that commons should not introduce transitive dependencies? This has been solved fairly well (Below). Cheers, - Ole Thomas On Fri, Sep 25, 2015 at 3:17 PM, Ole Ersoy <ole.er...@gmail.com> wrote: Hello, We have been discussing various ways to view what's happening internally with algorithms, and the topic of including SLF4J has come up. I know that this was discussed earlier and it was decided that CM is a low-level dependency, therefore it should minimize the transitive dependencies that it introduces. The Java community has adopted many means of dealing with potential logging conflicts, so I'm requesting that we use SLF4J for logging.
I know that JBoss introduced its own logging system, and this made me a bit nervous about this suggestion, so I looked up strategies for switching their logger out with SLF4J: http://stackoverflow.com/questions/14733369/force-jboss-logging-to-use-of-slf4j The general process I go through when working with many dependencies that might use commons-logging instead of SLF4J looks something like this: http://stackoverflow.com/questions/8921382/maven-slf4j-version-conflict-when-using-two-different-dependencies-that-requi With JDK9 individual modules can define their own isolated set of dependencies. At this point the fix should be permanent. If someone has a very intricate scenario that we have not yet seen, they could use (And probably should use) OSGi to isolate dependencies. WDYT? Cheers, - Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
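For reference, the dependency-free alternative Thomas mentions (JUL) ships with the JDK, so trace statements would add no transitive dependency at all. A minimal sketch of a guarded trace statement inside an iteration loop; the `iterate()` method and its values are invented purely for illustration:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of dependency-free tracing with java.util.logging (JUL).
// The iterate() method is a made-up stand-in for an optimizer loop.
class JulSketch {

    private static final Logger LOG = Logger.getLogger(JulSketch.class.getName());

    static double iterate(double x, int maxIter) {
        for (int i = 0; i < maxIter; i++) {
            x /= 2;
            // Guarded so the string concatenation costs nothing
            // when FINEST is disabled (the default).
            if (LOG.isLoggable(Level.FINEST)) {
                LOG.finest("iteration " + i + ": x = " + x);
            }
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(iterate(8.0, 3)); // 8 -> 4 -> 2 -> 1
    }
}
```

Anyone who still wants SLF4J in their application can route JUL through it with the standard jul-to-slf4j bridge, so logging against JUL inside the library does not lock downstream users into anything.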
Re: [Math] LeastSquaresOptimizer Design
On 09/25/2015 06:55 AM, Gilles wrote: On Thu, 24 Sep 2015 21:41:10 -0500, Ole Ersoy wrote: On 09/24/2015 06:01 PM, Gilles wrote: On Thu, 24 Sep 2015 17:02:15 -0500, Ole Ersoy wrote: On 09/24/2015 03:23 PM, Luc Maisonobe wrote: On 24/09/2015 21:40, Ole Ersoy wrote: Hi Luc, I gave this some more thought, and I think I may have tapped out too soon, even though you are absolutely right about what an exception does in terms of bubbling execution to a point where it stops or we handle it. Suppose we have an Optimizer and an Optimizer observer. The optimizer will emit one of four different events in the process of stepping through to the max number of iterations it is allotted: - SOLUTION_FOUND - COULD_NOT_CONVERGE_FOR_REASON_1 - COULD_NOT_CONVERGE_FOR_REASON_2 - END (Max iterations reached) So we have the observer interface: interface OptimizerObserver { success(Solution solution) update(Enum enum, Optimizer optimizer) end(Optimizer optimizer) } So if the Optimizer notifies the observer of `success`, then the observer does what it needs to with the results and moves on. If the observer gets an `update` notification, that means that given the current [constraints, number of iterations, data] the optimizer cannot finish. But the update method receives the optimizer, so it can adapt it, and tell it to continue or just trash it and try something completely different. If the `END` event is reached then the Optimizer could not finish given the number of allotted iterations. The Optimizer is passed back via the callback interface so the observer could allow more iterations if it wants to...perhaps based on some metric indicating how close the optimizer is to finding a solution. What this could do is allow the implementation of the observer to throw the exception if 'All is lost!', in which case the Optimizer does not need an exception. Totally understand that this may not work everywhere, but it seems like it could work in this case. WDYT?
With this version, you should also pass the optimizer in case of success. In most cases, the observer will just ignore it, but in some cases it may try to solve another problem, or to solve again with stricter constraints, using the previous solution as the start point for the more stringent problem. Another case would be to go from a simple problem to a more difficult problem using some kind of homotopy. Great - whoooh - glad you like this version a little better - for a sec I thought I had completely lost it :). IIUC, I don't like it: it looks like "GOTO"... Inside the optimizer it would work like this: while (!done) { if (can't converge) { observer.update(Enum.CANT_CONVERGE, this); } } That's fine. What I don't like is to have provision for changing the optimizer's settings and reuse the same instance. If the design of the optimizer allows for this, then the interface for the Observer would facilitate it. The person implementing the interface could throw an exception when they get the Enum.CANT_CONVERGE message, in which case the semantics are the same as they are now. On the other hand if the optimizer is not designed for reuse, perhaps for the reason that it causes more complexity than it's worth, the Observer interface could just exclude this aspect. The optimizer should be instantiated at the lowest possible level; it will report everything to the observer, but the "report" is not to be confused with the "optimizer". The design of the observer is flexible. It gives the person implementing the interface the ability to change the state of what is being observed. It's a bit like warming up leftovers. You are the observer. You grab yesterday's pizza. Throw it in the microwave. The microwave is the optimizer. We hit the 30-second button, and check on the pizza.
If we like it, we take it out, otherwise we hit 30 seconds again, or we throw the whole thing out, because we just realized that the Pizza rat took a chunk out: https://www.youtube.com/watch?v=UPXUG8q4jKU Then in the update method either modify the optimizer's parameters or throw an exception. If I'm referring to Luc's example of a high-level code "H" call to some mid-level code "M" itself calling CM's optimizer "CM", then "M" may not have enough info to know whether it's OK to retry "CM", but on the other hand, "H" might not even be aware that "M" is using "CM". So in this case the person implementing the Observer interface would keep the semantics that we have now. There is one important distinction though. The person uses the Enum parameter, indicating the root cause of the message, to throw their own (Meaningful to them) exception. As I have tried to explain several times over the years (but failed to convince anyone), the same problem exists with the exceptions: however detailed the message, it might not make sen
Re: [Math] Utilization of SLF4J?
On 09/25/2015 03:06 PM, Phil Steitz wrote: On 9/25/15 11:01 AM, Ole Ersoy wrote: On 09/25/2015 11:34 AM, Phil Steitz wrote: I disagree. Good tests, API contracts, exception management and documentation can and should eliminate the need for cluttering low-level library code with debug logging. Logging could be viewed as clutter. Constructed the right way, the logging statements could also be viewed as comments. I agree that good tests, API contracts, and in general keeping the design as minimalistic and simple as possible should be the first criteria to review before introducing logging. When the code is trivial I leave logging out, unless I need to put in a few statements in order to track collaboration between components in an algorithm. Other times the method is as simple as it gets and I still have to have logging. For example take the LevenbergMarquardtOptimizer optimize() method. It's atomic in the sense that what's in the method belongs in the method. There are several loops within the main loop, etc., and tracking what's going on, even with an observer being notified on each increment, would be far from elegant. Why, exactly, do you need to "track what is going on?" I hope that I don't. Most of the time the code we write, especially the CM code, is rock solid. When implementing complex algorithms we break it down to simple building blocks, reason about it, break it down some more, unit test it, and it looks fantastic. Then we hit a snag, and start putting in println statements. Then we delete these after the fix, cut up 5 chickens, and pray. If an issue occurs, then there is a low probability that it is a CM component. The traces can help prove that it is not the CM component. If the code is now rock solid, then the log statements bother no one. We put the code back in production, trace initially to make sure everything looks healthy, and then turn off logging.
On the flip side if we notice that something else seems off, we're back to putting in the println statements again. Watching traces of CM components in production is an additional insurance policy for guaranteeing quality. Also if there is an issue, and someone needs to understand it fast, the best way is to watch data flow through the system. For new contributors, this could be a good way to get up to speed on an algorithm. If you need to do that as a user of the code, some kind of listener or API to give you the information that you need is appropriate. I agree, but as I showed with the LevenbergMarquardtOptimizer, attempting to inject code into the optimize() method is a non-trivial exercise. Components should have a simple way to examine discrete steps that are being performed. Dumping text to an external resource to "solve" this usually indicates smelliness somewhere - either in the library API or the client code. Or it is a precaution to ensure that no one forgot to flush. For example perhaps we want to see what's going on with the parameters in this small (5% of the method size) section of code: What parameters and where did they come from? If from the client, the client can validate them. If the library needs to validate or confirm suitability, then it should do that in code or via tests. // compute the scaled predicted reduction // and the scaled directional derivative for (int j = 0; j < solvedCols; ++j) { int pj = permutation[j]; double dirJ = lmDir[pj]; work1[j] = 0; for (int i = 0; i <= j; ++i) { work1[i] += weightedJacobian[i][pj] * dirJ; } } If there is something wrong with this in production, the shortest path to figuring that out is reviewing a trace. The longer path is to stop the server. Grab the data. Open up a debugger. Run the data. If we wanted to observe this section of code the observer would be looking at sub-loops of the optimize method.
So that's doable, but it creates an interface design for the observer that's cluttered with events corresponding to various low-level algorithm details. Several times, I've been obliged to create a modified version of CM to introduce "print" statements (poor man's logging!) in order to figure out why my code did not do what it was supposed to. It's pretty tragic that any of us should have to do this. It's also wasteful, because if Gilles has to do this, then there's a good chance that others have to do it too. The reason Tomcat logs at various levels is so that we can see what's going on in production and track down bugs. No. Have a look at the Tomcat logging code. It is mostly initialization, shutdown and exceptions or warnings. I agree, but I would argue that these should be far simpler to reason about than the life cycle of some of the CM algorithms. This is nec
Re: [Math] Utilization of Lombok
On 09/25/2015 12:55 PM, Thomas Neidhart wrote: On 09/25/2015 05:04 PM, Ole Ersoy wrote: Hi Thomas, On 09/25/2015 08:45 AM, Thomas Neidhart wrote: Hi Ole, can you explain why you think that the addition of lombok brings any benefit to our users? Sure - I'm looking at the LevenbergMarquardtOptimizer ATM, and it has the following set of parameters: /* configuration parameters */ /** Positive input variable used in determining the initial step bound. */ private final double initialStepBoundFactor; /** Desired relative error in the sum of squares. */ private final double costRelativeTolerance; /** Desired relative error in the approximate solution parameters. */ private final double parRelativeTolerance; /** Desired max cosine on the orthogonality between the function vector * and the columns of the jacobian. */ private final double orthoTolerance; /** Threshold for QR ranking. */ private final double qrRankingThreshold; And corresponding getters: /** * Gets the value of a tuning parameter. * @see #withInitialStepBoundFactor(double) * * @return the parameter's value. */ public double getInitialStepBoundFactor() { return initialStepBoundFactor; } /** * Gets the value of a tuning parameter. * @see #withCostRelativeTolerance(double) * * @return the parameter's value. */ public double getCostRelativeTolerance() { return costRelativeTolerance; } /** * Gets the value of a tuning parameter. * @see #withParameterRelativeTolerance(double) * * @return the parameter's value. */ public double getParameterRelativeTolerance() { return parRelativeTolerance; } /** * Gets the value of a tuning parameter. * @see #withOrthoTolerance(double) * * @return the parameter's value. */ public double getOrthoTolerance() { return orthoTolerance; } /** * Gets the value of a tuning parameter. * @see #withRankingThreshold(double) * * @return the parameter's value. */ public double getRankingThreshold() { return qrRankingThreshold; } Lombok will generate all of these. 
Eclipse can do the same thing, but if we delete one of the parameters, then the corresponding getter also has to be deleted. Also Lombok cuts down on the source code noise, since it is a byte code generator. The generated code does not appear in the source. Lombok also has a @Builder annotation that can be used to generate an inner static builder class that provides a fluent construction API. So if we break off the LevenbergMarquardtOptimizer configuration into its own class, and generate all the getters and the fluent API, there should be substantial code reduction. Gilles is also working on a snapshot capability for neural nets, and the @Synchronized annotation could come in handy here. These are the items I have looked at so far. From my point of view, lombok can help developers by taking over some tedious tasks, but this is quite irrelevant in the case of CM as the majority of work goes into algorithm design and verification rather than in writing getters/setters (which btw has pretty good IDE support). I agree that the majority of time goes into the design of the algorithm. For me personally, when I'm looking at code, and it has a ton of boilerplate, it does slow my productivity...just because of all the noise. I'm happy once I've gotten it as DRY as possible. The more boilerplate, the more reluctant we are going to be to undertake refactoring, and we will make more mistakes (At least I will :) ). So this would just add additional complexity and the gain is very unclear. I think you will find it refreshing once you try it. At this point though I just wanted to float the idea. I'll complete the experiment and publish the result. At that point we will have a good baseline to gauge whether adding it will add enough value to offset the cost of adding it. Well I know lombok. Super! Keep in mind that it is a bit more difficult to integrate it into our build-chain.
As you probably know, in order to generate proper javadoc, you need to use delombok first to create source files which can be used for the javadoc process. I was thinking about that too actually. Here's how I was thinking it could be done in the case of the LevenbergMarquardtOptimizer. First split the configuration piece off. Then javadoc the properties of the configuration only. In the class header explain that the boilerplate has been generated using Lombok and put a reference to it there. If someone is smart enough to figure out how to work with the optimizer, then this should be trivial. So to summarize - move to a process of providing minimal javadoc and zero boilerplate. Explain to users that Lombok is used to facilitate this,
Re: [Math] Utilization of SLF4J?
On 09/25/2015 11:34 AM, Phil Steitz wrote: I disagree. Good tests, API contracts, exception management and documentation can and should eliminate the need for cluttering low-level library code with debug logging. Logging could be viewed as clutter. Constructed the right way, the logging statements could also be viewed as comments. I agree that good tests, API contracts, and in general keeping the design as minimalistic and simple as possible should be the first criteria to review before introducing logging. When the code is trivial I leave logging out, unless I need to put in a few statements in order to track collaboration between components in an algorithm. Other times the method is as simple as it gets and I still have to have logging. For example take the LevenbergMarquardtOptimizer optimize() method. It's atomic in the sense that what's in the method belongs in the method. There are several loops within the main loop, etc., and tracking what's going on, even with an observer being notified on each increment, would be far from elegant. For example perhaps we want to see what's going on with the parameters in this small (5% of the method size) section of code: // compute the scaled predicted reduction // and the scaled directional derivative for (int j = 0; j < solvedCols; ++j) { int pj = permutation[j]; double dirJ = lmDir[pj]; work1[j] = 0; for (int i = 0; i <= j; ++i) { work1[i] += weightedJacobian[i][pj] * dirJ; } } If there is something wrong with this in production, the shortest path to figuring that out is reviewing a trace. The longer path is to stop the server. Grab the data. Open up a debugger. Run the data. If we wanted to observe this section of code the observer would be looking at sub-loops of the optimize method. So that's doable, but it creates an interface design for the observer that's cluttered with events corresponding to various low-level algorithm details.
Several times, I've been obliged to create a modified version of CM to introduce "print" statements (poor man's logging!) in order to figure out why my code did not do what it was supposed to. It's pretty tragic that any of us should have to do this. It's also wasteful, because if Gilles has to do this, then there's a good chance that others have to do it too. The reason Tomcat logs at various levels is so that we can see what's going on in production and track down bugs. Let's become one with the logging and make it into something that strengthens both code integrity and comprehensibility. Everyone take out your Feng Shui mat and do some deep breathing right now. Here again, tests, good design, code inspection are the way to go in low-level components. I have also spent a lot of time researching bugs in [math], other Commons components and other complex systems. My experience is a little different: excessive logging / debug code for code that contains it often just tends to get in the way. It's a very good point. Sometimes we come across logging statements that are just noise. Given how rigorous we are about reviewing everything though, I think a win-win would be to limit the noise during code review, and pledge not to "Call our mom" in the middle of the code, etc. especially after it has rotted a bit. Rather than adding more code to maintain so that you can have less conceptual control over the functional code, it is better, IMNSHO, to focus on making the functional code as simple as possible with clean well-documented, test-validated API contracts. Totally agree. I think this should be the first priority, and that logging should be used when there are no simple clean alternatives. It also makes the code easier to debug while developing or modifying it (without resorting to poor man's logging, then deleting the "print", then reinstating them, then deleting them again, ad nauseam). Pretty sure we have all been here.
Cheers, - Ole Gilles [1] No quality or complexity judgment implied. Phil Gilles Thomas On Fri, Sep 25, 2015 at 3:17 PM, Ole Ersoy <ole.er...@gmail.com> wrote: Hello, We have been discussing various ways to view what's happening internally with algorithms, and the topic of including SLF4J has come up. I know that this was discussed earlier and it was decided that CM is a low level dependency, therefore it should minimize the transitive dependencies that it introduces. The Java community has adopted many means of dealing with potential logging conflicts, so I'm requesting that we use SLF4J for logging. I know that JBoss introduced its own logging system, and this made me a bit nervous about this suggestion, so I looked up strategies for switching their logger out with SLF4J: http://stackoverflow.
[Math] Utilization of SLF4J?
Hello, We have been discussing various ways to view what's happening internally with algorithms, and the topic of including SLF4J has come up. I know that this was discussed earlier and it was decided that CM is a low-level dependency, therefore it should minimize the transitive dependencies that it introduces. The Java community has adopted many means of dealing with potential logging conflicts, so I'm requesting that we use SLF4J for logging. I know that JBoss introduced its own logging system, and this made me a bit nervous about this suggestion, so I looked up strategies for switching their logger out with SLF4J: http://stackoverflow.com/questions/14733369/force-jboss-logging-to-use-of-slf4j The general process I go through when working with many dependencies that might use commons-logging instead of SLF4J looks something like this: http://stackoverflow.com/questions/8921382/maven-slf4j-version-conflict-when-using-two-different-dependencies-that-requi With JDK9 individual modules can define their own isolated set of dependencies. At this point the fix should be permanent. If someone has a very intricate scenario that we have not yet seen, they could use (And probably should use) OSGi to isolate dependencies. WDYT? Cheers, - Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
[Math] Utilization of Lombok
Hello, I'm going to utilize Lombok in a CM design experiment. Once the experiment is done CM can decide if it likes Lombok. I know that CM tries to stay dependency-free, so I just want to make clear that Lombok is compile-time only: http://stackoverflow.com/questions/6107197/how-does-lombok-work Lombok eliminates the need to code boilerplate, like getters, setters, toString(). It can also generate a fluent builder for configuration objects, check for null arguments, etc. It also has an @Synchronized annotation that is an improvement on the synchronized keyword. Lombok alters the byte code, keeping the source code clean and minimal. The additional generated code can be seen using an Eclipse plugin. So for example when looking at the outline view, you can see the generated getters, etc. https://standardofnorms.wordpress.com/2013/05/10/reducing-java-boilerplate-code-with-lombok-with-eclipse-installation/ Cheers, - Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [Math] LeastSquaresOptimizer Design
On 09/24/2015 06:31 AM, luc wrote: On 2015-09-24 04:16, Ole Ersoy wrote: On 09/23/2015 03:09 PM, Luc Maisonobe wrote: CM is not intended to be a design pattern people should mimic. We are so bad at this it would be a shame. No one in their right mind would copy or reuse this stuff. It is for internal use only and we don't even have the resources to manage it by ourselves so we can't consider it as a path people should follow as we are leading them. Here we would be leading them directly against the wall. Hehe - I think that's like Michael Jordan saying - "Guys, don't try to be like me. I just play a little ball. Dunk from the free throw line. Six world championships, but THAT's it!". In any case, I really appreciate you and Gilles taking the time to talk. Luc (And possibly Gilles) - I can actually see why you are getting a bit annoyed, because I'm ignoring something important. I've been doing 90% NodeJS stuff lately (Which is event-loop based and relies on callbacks) so I forgot one very important thing that I think you have both tried to tell me. The exception undoes the current callstack / breaks the current program flow, bubbling up to the handler. That's a good point. OK - So scratch the callback thinking for synchronous code. The Lombok stuff should still be good though, and hopefully some of the callback discussion around an asynchronous option - I hope! Geez. What do you think about having one exception per class with an Enum that encodes the various types of exceptional conditions that the class can find itself in? So in the case of LevenbergMarquardtOptimizer there would be a: - LevenbergMarquardtOptimizerException: - LevenbergMarquardtOptimizerExceptionEnum When the exception is thrown it sets the Enum indicating the root cause. The enum can then be used as a key to look up the corresponding message. Any better? Sure.
I would suggest adding some parameters to help the upper level format a meaningful message (say the number of iterations performed if you hit a max iteration, so users become aware they should have set the limit higher). Nothing over-engineered, a simple Object[] that can be used as last argument to something like String.format() would be enough. Brilliant - I'll set up a repository and start experimenting. Thanks again, - Ole - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
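A minimal sketch of the one-exception-per-class idea combined with Luc's Object[] suggestion. The class name, enum constants, and message patterns below are invented for illustration; they are not actual CM API:

```java
// Sketch: a single exception type per optimizer, with an enum pinning
// down the root cause and an Object[] of arguments fed straight to
// String.format(). Names and message patterns are illustrative only.
class LmOptimizerException extends RuntimeException {

    enum Reason {
        // The %d / %g slots are filled from the varargs below.
        MAX_ITERATIONS_EXCEEDED("gave up after %d iterations; consider raising the limit"),
        TOO_SMALL_COST_RELATIVE_TOLERANCE("cost relative tolerance %g is too small");

        final String pattern;
        Reason(String pattern) { this.pattern = pattern; }
    }

    private final Reason reason;

    LmOptimizerException(Reason reason, Object... args) {
        super(String.format(reason.pattern, args));
        this.reason = reason;
    }

    Reason getReason() { return reason; }

    public static void main(String[] args) {
        try {
            throw new LmOptimizerException(Reason.MAX_ITERATIONS_EXCEEDED, 100);
        } catch (LmOptimizerException e) {
            // The upper level can switch on the enum instead of parsing text.
            System.out.println(e.getReason() + ": " + e.getMessage());
        }
    }
}
```

The enum gives callers a stable key to branch on (or to look up a localized message with), while the formatted default message keeps stack traces readable without any I18N machinery inside the library.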
Re: [Math] LeastSquaresOptimizer Design
Hi Luc, I gave this some more thought, and I think I may have tapped out too soon, even though you are absolutely right about what an exception does in terms of bubbling execution to a point where it stops or we handle it. Suppose we have an Optimizer and an Optimizer observer. The optimizer will emit one of four different events in the process of stepping through to the max number of iterations it is allotted: - SOLUTION_FOUND - COULD_NOT_CONVERGE_FOR_REASON_1 - COULD_NOT_CONVERGE_FOR_REASON_2 - END (Max iterations reached) So we have the observer interface: interface OptimizerObserver { success(Solution solution) update(Enum enum, Optimizer optimizer) end(Optimizer optimizer) } So if the Optimizer notifies the observer of `success`, then the observer does what it needs to with the results and moves on. If the observer gets an `update` notification, that means that given the current [constraints, number of iterations, data] the optimizer cannot finish. But the update method receives the optimizer, so it can adapt it, and tell it to continue or just trash it and try something completely different. If the `END` event is reached then the Optimizer could not finish given the number of allotted iterations. The Optimizer is passed back via the callback interface so the observer could allow more iterations if it wants to...perhaps based on some metric indicating how close the optimizer is to finding a solution. What this could do is allow the implementation of the observer to throw the exception if 'All is lost!', in which case the Optimizer does not need an exception. Totally understand that this may not work everywhere, but it seems like it could work in this case. WDYT? Cheers, - Ole On 09/23/2015 03:09 PM, Luc Maisonobe wrote: On 23/09/2015 19:20, Ole Ersoy wrote: Hi Luc, Hi Ole, On 09/23/2015 03:02 AM, luc wrote: Hi, On 2015-09-22 02:55, Ole Ersoy wrote: Hola, On 09/21/2015 04:15 PM, Gilles wrote: Hi.
On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote: On 09/20/2015 05:51 AM, Gilles wrote: On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote: Wanted to float some ideas for the LeastSquaresOptimizer (possibly General Optimizer) design. For example with the LevenbergMarquardtOptimizer we would do: `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);` Rough optimize() outline: public static void optimize() { //perform the optimization //If successful c.notify(LevenbergMarquardtResultsEnum.SUCCESS, solution); //If not successful c.notify(LevenbergMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE, diagnostic); //or c.notify(LevenbergMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE, diagnostic) //etc } The diagnostic, when turned on, will contain a trace of the last N iterations leading up to the failure. When turned off, the Diagnostic instance only contains the parameters used to detect failure. The diagnostic could be viewed as an indirect way to log optimizer iterations. WDYT? I'm wary of having several different ways to convey information to the caller. It would just be one way. One way for optimizers, one way for solvers, one way for ... Yes I see what you mean, but I think on the whole it will be worth it to add additional sugar code that removes the need for exceptions. But the caller may not be the receiver (it could be). The receiver would be an observer attached to the OptimizationContext that implements an interface allowing it to observe the optimization. I'm afraid that it will add to the questions of what to put in the code and how. [We already had sometimes heated discussions just for the IMHO obvious (e.g. code formatting, documentation, exceptions...).] Hehe. Yes I remember some of these discussions. I wonder how much time was spent debating the exceptions alone? Surely everyone must have had this feeling in the pit of their stomach that there's got to be a better way.
On the exception topic, these are some of the issues: I18N === If you are new to Commons Math and thinking about designing a Commons Math compatible exception, you should probably understand the I18N stuff that's bound to the exception (and wonder why it's bound to the exception). Not really true. Well a lot of things are gray. Personally if I'm dealing with an API, I like to understand it, so that there are no surprises. And I understand that the I18N coupling might not force me to use it, but if I want to be smart about my architecture, and simplify my design, then I should look at it. Maybe it is a good idea. Maybe I should just gloss over it? Am I being sloppy if I just gloss over it? Or is there an alternative that provides the same functionality, or maybe something better, that does not come with any of these side effects? The I18N was really simple at the start. Yup I reviewed it and thought - it's probably no big deal - but as I started looking into reusing the CM exceptions, I decided
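The observer interface sketched informally earlier in the thread could look like this in Java. This is a sketch under assumptions: `Solution`, `ConvergenceIssue`, and the mutable iteration budget on `Optimizer` are hypothetical stand-ins, not CM types.

```java
/** Hypothetical stand-in for the real solution type. */
class Solution {
    final double[] point;
    Solution(double[] point) { this.point = point; }
}

/** Reasons the optimizer may stall before exhausting its budget. */
enum ConvergenceIssue {
    COULD_NOT_CONVERGE_FOR_REASON_1,
    COULD_NOT_CONVERGE_FOR_REASON_2
}

/** Minimal optimizer stand-in exposing the knob an observer may adjust. */
class Optimizer {
    private int maxIterations;
    Optimizer(int maxIterations) { this.maxIterations = maxIterations; }
    int getMaxIterations() { return maxIterations; }
    void setMaxIterations(int maxIterations) { this.maxIterations = maxIterations; }
}

/** The observer interface from the thread, with concrete signatures. */
interface OptimizerObserver {
    /** The optimizer converged; consume the result. */
    void success(Solution solution);
    /** The optimizer stalled; the observer may adapt the optimizer and retry. */
    void update(ConvergenceIssue issue, Optimizer optimizer);
    /** The iteration budget ran out; the observer may grant more iterations. */
    void end(Optimizer optimizer);
}
```

An observer implementation that doubles the budget in `end()` gives exactly the "allow more iterations if it wants to" behavior described above, without any exception leaving the optimizer.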
Re: [Math] LeastSquaresOptimizer Design
On 09/24/2015 03:23 PM, Luc Maisonobe wrote: On 24/09/2015 21:40, Ole Ersoy wrote: Hi Luc, I gave this some more thought, and I think I may have tapped out too soon, even though you are absolutely right about what an exception does in terms of bubbling execution to a point where it stops or we handle it. Suppose we have an Optimizer and an Optimizer observer. The optimizer will emit the following events in the process of stepping through to the max number of iterations it is allotted: - SOLUTION_FOUND - COULD_NOT_CONVERGE_FOR_REASON_1 - COULD_NOT_CONVERGE_FOR_REASON_2 - END (Max iterations reached) So we have the observer interface: interface OptimizerObserver { success(Solution solution) update(Enum enum, Optimizer optimizer) end(Optimizer optimizer) } So if the Optimizer notifies the observer of `success`, then the observer does what it needs to with the results and moves on. If the observer gets an `update` notification, that means that given the current [constraints, number of iterations, data] the optimizer cannot finish. But the update method receives the optimizer, so it can adapt it and tell it to continue, or just trash it and try something completely different. If the `END` event is reached then the Optimizer could not finish given the number of allotted iterations. The Optimizer is passed back via the callback interface so the observer could allow more iterations if it wants to...perhaps based on some metric indicating how close the optimizer is to finding a solution. What this could do is allow the implementation of the observer to throw the exception if 'All is lost!', in which case the Optimizer does not need an exception. Totally understand that this may not work everywhere, but it seems like it could work in this case. WDYT? With this version, you should also pass the optimizer in case of success.
In most cases, the observer will just ignore it, but in some cases it may try to solve another problem, or to solve again with stricter constraints, using the previous solution as the start point for the more stringent problem. Another case would be to go from a simple problem to a more difficult problem using some kind of homotopy. Great - whoooh - glad you like this version a little better - for a sec I thought I had completely lost it :). Note to self ... cancel therapy with Dr. Phil. BTW - Gilles - this could also be used as a lightweight logger. The Optimizer could publish information deemed interesting on each ITERATION event. The observer could then be wired with SLF4J and perform the same type of logging that the Optimizer would perform. So CM could declare SLF4J as a test dependency, and unit tests could log iterations using it. Lombok also has a @Slf4j annotation that's pretty sweet. Saves the SLF4J boilerplate. Cheers, - Ole best regards, Luc Cheers, - Ole On 09/23/2015 03:09 PM, Luc Maisonobe wrote: On 23/09/2015 19:20, Ole Ersoy wrote: Hi Luc, Hi Ole, On 09/23/2015 03:02 AM, luc wrote: Hi, On 2015-09-22 02:55, Ole Ersoy wrote: Hola, On 09/21/2015 04:15 PM, Gilles wrote: Hi. On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote: On 09/20/2015 05:51 AM, Gilles wrote: On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote: Wanted to float some ideas for the LeastSquaresOptimizer (possibly General Optimizer) design.
For example with the LevenbergMarquardtOptimizer we would do: `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);` Rough optimize() outline: public static void optimize() { //perform the optimization //If successful c.notify(LevenbergMarquardtResultsEnum.SUCCESS, solution); //If not successful c.notify(LevenbergMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE, diagnostic); //or c.notify(LevenbergMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE, diagnostic) //etc } The diagnostic, when turned on, will contain a trace of the last N iterations leading up to the failure. When turned off, the Diagnostic instance only contains the parameters used to detect failure. The diagnostic could be viewed as an indirect way to log optimizer iterations. WDYT? I'm wary of having several different ways to convey information to the caller. It would just be one way. One way for optimizers, one way for solvers, one way for ... Yes I see what you mean, but I think on the whole it will be worth it to add additional sugar code that removes the need for exceptions. But the caller may not be the receiver (it could be). The receiver would be an observer attached to the OptimizationContext that implements an interface allowing it to observe the optimization. I'm afraid that it will add to the questions of what to put in the code and how. [We already had sometimes heated discussions just for the IMHO obvious (e.g. code formatting, documentation, exceptions...).] Hehe. Yes I remember some of these discussions. I wonder how much time was spent debating the exceptions alone
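The lightweight-logging idea above can be sketched as an observer that records each ITERATION event. The thread mentions SLF4J and Lombok's @Slf4j; this sketch uses java.util.logging instead so it has no dependencies, and the `IterationEvent` payload is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical payload published by the optimizer on each ITERATION event. */
class IterationEvent {
    final int iteration;
    final double cost;
    IterationEvent(int iteration, double cost) {
        this.iteration = iteration;
        this.cost = cost;
    }
}

/** Observer that doubles as a lightweight iteration logger. */
class IterationLogger {
    private static final Logger LOG = Logger.getLogger(IterationLogger.class.getName());
    private final List<IterationEvent> trace = new ArrayList<>();

    /** Record the event and log it; with SLF4J this would be log.debug(...). */
    void iteration(IterationEvent event) {
        trace.add(event);
        LOG.log(Level.FINE, "iteration {0}, cost {1}",
                new Object[] { event.iteration, event.cost });
    }

    /** The recorded trace, usable as the failure diagnostic. */
    List<IterationEvent> trace() { return trace; }
}
```

With this shape, CM itself would not need a logging dependency at all: the client decides in the observer whether events go to SLF4J, a test harness, or nowhere.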
Re: [Math] LeastSquaresOptimizer Design
On 09/24/2015 04:05 PM, Gilles wrote: On Thu, 24 Sep 2015 08:43:38 -0500, Ole Ersoy wrote: On 09/24/2015 06:31 AM, luc wrote: On 2015-09-24 04:16, Ole Ersoy wrote: On 09/23/2015 03:09 PM, Luc Maisonobe wrote: CM is not intended to be a design pattern people should mimic. We are so bad at this it would be a shame. No one in their right mind would copy or reuse this stuff. It is for internal use only and we don't even have the resources to manage it by ourselves, so we can't consider it as a path people should follow as we are leading them. Here we would be leading them directly against the wall. Hehe - I think that's like Michael Jordan saying - "Guys, don't try to be like me. I just play a little ball. Dunk from the free throw line. Six world championships, but THAT's it!". In any case, I really appreciate you and Gilles taking the time to talk. Luc (and possibly Gilles) - I can actually see why you are getting a bit annoyed, because I'm ignoring something important. I've been doing 90% NodeJS stuff lately (which is event loop based and relies on callbacks), so I forgot one very important thing that I think you have both tried to tell me. The exception undoes the current callstack / breaks the current program flow, bubbling up to the handler. That's a good point. OK - so scratch the callback thinking for synchronous code. The Lombok stuff should still be good though, and hopefully some of the callback discussion around an asynchronous option - I hope! Geez. What do you think about having one exception per class with an Enum that encodes the various types of exceptional conditions that the class can find itself in? So in the case of LevenbergMarquardtOptimizer there would be a: - LevenbergMarquardtOptimizerException - LevenbergMarquardtOptimizerExceptionEnum When the exception is thrown it sets the Enum indicating the root cause. The enum can then be used as a key to look up the corresponding message. Any better? Sure.
I would suggest adding some parameters to help the upper level format a meaningful message (say the number of iterations performed if you hit a max iteration, so users become aware they should have set the limit higher). Nothing over-engineered, a simple Object[] that can be used as the last argument to something like String.format() would be enough. Brilliant - I'll set up a repository and start experimenting. Thanks again, - Ole I don't understand what Luc proposed. But just having "Object[]" mentioned makes me shiver... :-{ Thanks to the "ExceptionContext" it is readily possible to add as many "messages" as we want to be displayed. [There is no need to ask the caller to use "format()" as it is done in CM.] And there are also methods for setting and getting an "Object". I'd be for using more (possibly "local") exceptions if we want to convey more, and more specific, information. This should be done with getters that return typed information, not "Object"s. Javascripters do what Luc is advocating all the time, so I'm used to it. If the exception is specific to the class throwing the exception then we could attach a reference to the instance throwing the exception and use Lombok to generate binary getters. Cheers, - Ole Gilles
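The one-exception-per-class idea, combined with Gilles's preference for typed getters over a raw Object[], could be sketched like this. All names are hypothetical, not CM API.

```java
/** Hypothetical root causes for a single optimizer class (not CM API). */
enum LmFailure {
    TOO_SMALL_COST_RELATIVE_TOLERANCE,
    TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
    MAX_ITERATIONS_REACHED
}

/**
 * One exception type for the whole class; the enum pins down the root
 * cause, and typed getters expose the details instead of an Object[].
 */
class LmOptimizerException extends RuntimeException {
    private final LmFailure failure;
    private final int iterations;

    LmOptimizerException(LmFailure failure, int iterations) {
        super(failure.name() + " after " + iterations + " iterations");
        this.failure = failure;
        this.iterations = iterations;
    }

    /** The root cause, usable as a key into a localized message table. */
    LmFailure getFailure() { return failure; }

    /** Typed detail, per the "getters that return typed information" point. */
    int getIterations() { return iterations; }
}
```

A caller can switch on `getFailure()` to decide between retrying with a larger budget and giving up, without parsing the message string.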
Re: [Math] LeastSquaresOptimizer Design
Why should the instance throwing the exception hold a field with the information? Separation of concerns: the optimizer does the computation, then the exception holds what's needed for a full report of the failure. I would see what makes sense on a case-by-case basis. For example if the Observer, which is implemented by the client / person using CM, realizes that it can't continue, it can throw an application specific exception using a set of Enums that are coded for the application. if (optimizer.yourNeverGonnaGetItEver()) { throw new ApplicationSpecificException(ApplicationErrorCodes.PARTYS_OVER, optimizer); } The error code should be specific enough for the application to understand that the optimizer argument is the optimizer (Object type), and then it can construct the message from there. Or the report is done at the observer's level, based on complete information routinely returned by the optimizer at every step (cf. previous mail). I would stick with the same process in this case. Cheers, - Ole
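A runnable version of the snippet above could look like this. `ApplicationErrorCodes`, `ApplicationSpecificException`, and `yourNeverGonnaGetItEver()` are the thread's own illustrative names; the stand-in optimizer class is added here just to make the sketch compile.

```java
/** Application-level error codes, defined by the client, not by CM. */
enum ApplicationErrorCodes { PARTYS_OVER }

/** Client-side exception carrying the code and the optimizer that gave up. */
class ApplicationSpecificException extends RuntimeException {
    final ApplicationErrorCodes code;
    final Object optimizer;  // Object type, as described in the thread

    ApplicationSpecificException(ApplicationErrorCodes code, Object optimizer) {
        super(code.name());
        this.code = code;
        this.optimizer = optimizer;
    }
}

/** Stand-in optimizer with the thread's tongue-in-cheek hopelessness check. */
class StalledOptimizer {
    boolean yourNeverGonnaGetItEver() { return true; }
}

class ObserverEscalationDemo {
    /** The observer escalates only when the optimizer reports it is hopeless. */
    static void observe(StalledOptimizer optimizer) {
        if (optimizer.yourNeverGonnaGetItEver()) {
            throw new ApplicationSpecificException(ApplicationErrorCodes.PARTYS_OVER, optimizer);
        }
    }
}
```

The key point is that the exception type and its codes live entirely in application code, so CM itself stays exception-free on this path.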