Re: [Math] Cleaning up the curve fitters

Phil Steitz Thu, 18 Jul 2013 14:17:40 -0700

On 7/18/13 1:48 PM, Ajo Fod wrote:
> Hello folks,
>
> There is a lot of work in API design. However, Konstantin's point is that
> it takes a lot of effort to convince Gilles of any alternatives. API design
> issues should really be second to functionality. This idea seems to be lost
> in conversations.


With patience and collaboration you can have both, and we *need* to
have both.  You can't get to a stable API and an approachable,
maintainable code base without thinking carefully about API design.
>
> I agree with Gilles that providing tests and benchmarks that exhibit the
> advantages of a particular method is probably the best way to show other
> contributors the value of an alternative approach.

There is some value to this, but honestly much more value in
carefully researching and presenting the numerical analysis to
support improvement / performance claims.
>
> It is quite depressing to the contributor to see one's contribution be
> rejected when efficiency/accuracy improvements are demonstrated. 

What you "demonstrated" in one case was better performance in one
problem instance.  The change of variable approach you implemented
was, in my admittedly possibly naive numerics view, questionable.  I
asked to see numerical analysis support and no one provided that. 
Had you provided that, I would have argued to include some version
of the patch.

> In a
> better world, rejecting a patch that passes the hurdle of demonstrating an
> efficiency improvement over existing code should come with a responsibility
> of showing alternate tests that the patch fails and the original code
> passes. Otherwise, the patch should be accepted by default. The person who
> commits or designed the API is free to make changes to fit API design.

This is essentially what Gilles ended up doing.  You may not agree
with the approach, but he did in fact address the core issue.
>
> Just like API designers are not experts at the underlying math,
> contributors are not necessarily experts at the underlying API design. To
> unlock the efficiency of open source, contributor morale needs to be
> considered and classes that pass tests should really be accepted.

I agree that we should try to be friendly and encouraging and I
apologize if we have not been so.  That said, the process of
contributing here is not just tossing patches over the wall.  First
you need to get community support for the ideas.  Then work
collaboratively to get patches that work for the code and community.
>
> For example, performance AND accuracy improvements to the existing algorithm
> were demonstrated for AdaptiveQuadrature in my patches to MATH-995.

Sorry, I was not convinced by the accuracy and performance claims
and, as I said above, I suspect that the change of variable approach
may not be the best way to handle improper integrals.  I am not
claiming authority here - just - again - asking for real numerical
analysis arguments to support the claims you are making.
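
For readers following the technical point, here is a minimal sketch of what
a change-of-variable treatment of an improper integral looks like.  It is
not the MATH-995 patch and not Commons Math code; the substitution
x = t / (1 - t^2), the midpoint rule, the function names and the Gaussian
test integrand are all assumptions chosen for illustration.

```python
import math


def integrate_real_line(f, n=2000):
    """Approximate the integral of f over (-inf, inf).

    Substitutes x = t / (1 - t^2), which maps (-1, 1) onto the real line,
    then applies the composite midpoint rule in t, so the endpoints
    t = -1 and t = +1 are never evaluated.
    """
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        t = -1.0 + (i + 0.5) * h
        one_minus_t2 = 1.0 - t * t
        x = t / one_minus_t2
        # dx/dt for the substitution above
        jacobian = (1.0 + t * t) / (one_minus_t2 * one_minus_t2)
        total += f(x) * jacobian
    return total * h


def gaussian(x):
    """Hypothetical test integrand with a known integral, sqrt(pi)."""
    return math.exp(-x * x)


approx = integrate_real_line(gaussian)
exact = math.sqrt(math.pi)
error = abs(approx - exact)
```

A rapidly decaying integrand like this one is the friendly case: the
transformed integrand vanishes smoothly at both ends of (-1, 1).  Agreement
on one such instance says little about slowly decaying or oscillatory
integrands, where the Jacobian grows without bound near the endpoints; that
is the kind of question numerical analysis would have to settle.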

It would be a lot better if we focused discussion on the actual
technical issues and mathematical principles rather than
generalities about how hard / easy it is to get stuff in.

Phil
> The only joy I got out of that was Gilles putting a comment in the docs for
> the existing class:
> "The Javadoc now draws attention that the [existing] algorithm is not 100%
> fool-proof."!
> Also, I was asked to open a new issue about Adaptive Quadratures to figure
> out what is the best quadrature method ... all while a patch that is a clear
> improvement over existing code wastes away. Why not accept the patch and
> make improvements as necessary?
>
> My impression since that patch was rejected, is that it just seems like a
> huge hurdle to get any patch past the API design requirements that are
> frankly not as clear to me as they are to the designer. I can see how others
> feel the same way.
>
> Cheers,
> Ajo.
>
> Gilles: if you don't want to end up spending time developing Gauss-Hermite
> quadrature or something else you don't really need, perhaps you should
> consider accepting/modifying code that was shown to work by someone who
> needed that functionality. It is reasonable to develop alternatives to fix
> flaws/gaps, but otherwise it's your effort wasted.  If someone's
> contribution doesn't fit your view of the API, then by all means edit the
> patch, but if you go about rejecting things that work, there won't be as
> many contributors to CM.
>
> On Thu, Jul 18, 2013 at 10:08 AM, Roger L. Whitcomb <
> roger.whitc...@actian.com> wrote:
>
>> As an outsider listening to these discussions, it seems like:
>> a) *IF* there are problems with the current arrangement of packages, APIs,
>> or whatever, then a constructive approach would be for the one who sees
>> such problems to take the time to not just criticize and point out "flaws",
>> but to dig in and rearrange the packages, redo the APIs, provide unit
>> tests, and submit a patch with these changes, along with quantitative
>> justification, benchmarks, test cases, etc.  It is quite easy to criticize,
>> from the sidelines, the one who is actually doing the work, but quite
>> another matter to roll up your sleeves and join in the work....
>> b) Since Math is a "library", it seems like there need to be
>> implementations of many different algorithms, since (quite clearly) not
>> every algorithm is suited to every problem.  To say that X method doesn't
>> work well for problem Y, is not necessarily a reason to rewrite X method,
>> if that method is correctly implementing the algorithm.  Maybe the
>> algorithm is simply not the right one to use for the problem.
>> c) Comments that imply (or state outright) that someone who has (clearly)
>> done a lot of work has done it "...without much thinking..." are clearly
>> out of line.  In my experience, the only reason to resort to name calling
>> and character assassination is because one has no worthy arguments to put
>> forward.
>> d) Kudos to the Commons committers who have been doing the work ...
>>
>> My 2 cents...
>>
>> ~Roger Whitcomb
>> Apache Pivot PMC Chair
>>
>> -----Original Message-----
>> From: Gilles [mailto:gil...@harfang.homelinux.org]
>> Sent: Thursday, July 18, 2013 9:35 AM
>> To: dev@commons.apache.org
>> Subject: Re: [Math] Cleaning up the curve fitters
>>
>> On Thu, 18 Jul 2013 11:47:03 -0400, Konstantin Berlin wrote:
>>> I appreciate the comment. I would like to help, but currently my
>>> schedule is full. Maybe towards the end of the year.
>>>
>>> I think the first approach should be do no harm. The optimization
>>> package keeps getting refactored every few months without much
>>> thinking involved. We had the discussion previously, with Gilles
>>> unilaterally deciding on the current tree, which he now wants to
>>> change again.
>> As I said,
>> as Luc said,
>> as Phil said,
>> again and again and again,
>> we are not optimization (as a scientific field) experts here, but we do
>> use Commons Math in scientific code that is pretty compute intensive (and
>> yes, maybe not in the same sense as you'd like it to be for your comfort).
>> Current code has, and may still have problems, but we see them only
>> through running unit tests, running our applications, running code examples
>> submitted by issue reporters.
>> We improve what we can, given time and motivation constraints.
>> Other than that, there is nothing.
>>
>> Yes, we already had that asymmetrical conversation where _you_ declare
>> what _we_ should do.
>>
>>> As someone who uses optimization regularly I would say the current API
>>> state (not just package naming) leaves a lot to be desired, and is not
>>> amenable to the various modifications that people might need for larger
>>> problems. So if you are going to modify it, you should at least open
>>> up the API to the possibility that different optimization steps can be
>>> done using various techniques, depending on the problem.
>>>
>>> We should also accept that not everything can fit neatly into a
>>> package tree and a single set of APIs. A good example is least
>>> squares. Linear least squares does not require an initial guess at a
>>> solution, and by performing decomposition ahead of time you can
>>> quickly recompute the solution given different input values. However,
>>> an iterative least squares method might not have these properties.
>>> There are probably countless other examples.
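
The linear least squares point above can be sketched as follows.  This is
an illustration only, not Commons Math code: the class name, the use of the
normal equations with a Cholesky factorization, and the sample data are
assumptions.  (Normal equations are used here for brevity; a QR-based
solver is generally the more numerically robust choice.)

```python
import math


def cholesky(m):
    """Return lower-triangular L with L * L^T == m (m symmetric positive definite)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


class LinearLeastSquares:
    """Hypothetical helper: factor the design matrix once, solve many times."""

    def __init__(self, design):
        self.design = design
        cols = len(design[0])
        # Gram matrix A^T A, factored a single time up front.
        gram = [[sum(row[i] * row[j] for row in design) for j in range(cols)]
                for i in range(cols)]
        self.low = cholesky(gram)

    def solve(self, y):
        """Least-squares coefficients for observations y; no initial guess needed."""
        n = len(self.low)
        rhs = [sum(row[i] * yi for row, yi in zip(self.design, y)) for i in range(n)]
        z = [0.0] * n
        for i in range(n):  # forward substitution: L z = A^T y
            z[i] = (rhs[i] - sum(self.low[i][k] * z[k] for k in range(i))) / self.low[i][i]
        x = [0.0] * n
        for i in reversed(range(n)):  # back substitution: L^T x = z
            x[i] = (z[i] - sum(self.low[k][i] * x[k] for k in range(i + 1, n))) / self.low[i][i]
        return x


# Straight-line model a + b*x sampled at x = 0, 1, 2, 3.
design = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
fit = LinearLeastSquares(design)
coeffs_first = fit.solve([1.0, 3.0, 5.0, 7.0])   # data from y = 1 + 2x
coeffs_second = fit.solve([4.0, 3.0, 2.0, 1.0])  # data from y = 4 - x
```

The second call to solve() reuses the stored factor and only performs two
triangular solves.  An iterative nonlinear solver has no equivalent
precomputed object and needs a starting point, which is why forcing both
behind one interface is awkward.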
>>>
>>> Because optimization problems are really computationally hard all the
>>> little specific differences matter, that is why Gilles' approach of
>>> sweeping everything under the rug and into some rigid not thought out
>>> hierarchical API forces these methods to adapt (or drop) numerical
>>> aspects that should not be there (e.g. polynomial fits). This has
>>> *huge* performance implications, but the issue is treated as some OO
>>> design 101 class, with the focus on how to force everything into a
>>> simple inheritance structure, numerics be damned.
>>>
>>> I would gladly help with the feedback when I can. Ajo and I provided
>>> code for adaptive integration, yet the whole issue was completely
>>> ignored. So I am not sure how much effort is required for the
>>> developers to take an idea or mostly completed code and make a change,
>>> rather than reject even the most basic numerical approaches that are
>>> taught in introduction classes as something that needs to be
>>> benchmarked.
>> As usual, you are mixing everything, from algorithms to implementations,
>> from proposing new features to denigrating existing ones (with non-existent
>> or inappropriate use-cases), from numerical to efficiency considerations...
>> [On top of it, you blatantly affirm that this issue has been ignored, even
>> as I provided[1] an analysis[2] of what was actually happening.
>> People like you seem to ignore that we work on this project as volunteers!]
>> Not even speaking of derogatory remarks like "sweeping [...] under the rug"
>> and "not thought out" and insinuating that everything was better and more
>> efficient before. Which is simply not true.
>>
>> It's an asymmetrical discussion because you declare that half-baked code
>> is good enough and _we_ have to work even more than if we'd have to
>> implement the feature from scratch.
>>
>>
>> Gilles
>>
>> [1] In the spare time I do _not_ have either.
>> [2] Which dragged me to the implementation of the Gauss-Hermite quadrature
>>      scheme (although I had no personal use of it), which seems to be the
>>      appropriate way to deal with the improper integral reported in the
>>      issue which you refer to.
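
For context on the Gauss-Hermite scheme mentioned in the footnote above, a
minimal sketch of the idea follows.  It is not the Commons Math
implementation; the fixed 3-point rule and the function name are
assumptions made to keep the example short.  The rule approximates the
integral of exp(-x^2) * f(x) over the whole real line directly, with no
truncation of the interval and no change of variable.

```python
import math

# 3-point Gauss-Hermite rule.  The nodes are the roots of the Hermite
# polynomial H_3(x) = 8x^3 - 12x; the weights sum to sqrt(pi).
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def gauss_hermite_3(f):
    """Approximate the integral of exp(-x^2) * f(x) over (-inf, inf).

    Note that f is the integrand WITHOUT the exp(-x^2) factor; the
    weight function is built into the rule.
    """
    return sum(w * f(x) for x, w in zip(NODES, WEIGHTS))


# An n-point rule is exact for polynomial f up to degree 2n - 1 (here 5).
moment0 = gauss_hermite_3(lambda x: 1.0)     # exact value: sqrt(pi)
moment2 = gauss_hermite_3(lambda x: x ** 2)  # exact value: sqrt(pi) / 2
moment4 = gauss_hermite_3(lambda x: x ** 4)  # exact value: 3 * sqrt(pi) / 4
```

The rule only pays off when the integrand really behaves like exp(-x^2)
times something smooth; for other decay rates a different weight function
or a different method is the better fit.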
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org
>>