On 1/16/15 2:09 AM, Thomas Neidhart wrote:
> On 01/16/2015 01:30 AM, Gilles wrote:
>> On Thu, 15 Jan 2015 15:41:11 -0700, Phil Steitz wrote:
>>> On 1/15/15 2:24 PM, Thomas Neidhart wrote:
>>>> On 01/08/2015 12:34 PM, Gilles wrote:
>>>>> Hi.
>>>>>
>>>>> Raising this issue once again.
>>>>> Are we going to upgrade the requirement for the next major release?
>>>>>
>>>>   [ ] Java 5
>>>>   [x] Java 6
>>>>   [x] Java 7
>>>>   [ ] Java 8
>>>>   [ ] Java 9
>>>>
>>>> A while ago I thought that it would be cool to switch to Java 7/8 for
>>>> some of the nice new features (mainly fork/join, lambda expressions and
>>>> diamond operator, the rest is more or less unimportant for math imho).
>>>>
>>>> But after some thought I think they are not really needed for the
>>>> following reasons:
>>>>
>>>>  * the main focus of math is on developing high-quality, well tested and
>>>> documented algorithms, the existing language features are more than
>>>> enough for this
>> Sure.
>> Not so long ago, some people were claiming that nothing beats
>> programming in "assembly" language.
>>
>>> +1
>>>>  * coming up with multi-threaded algorithms might be appealing but it is
>>>> also hard work and I wonder if it really makes sense in the times of
>>>> projects like Mahout / Hadoop / ... which aim for even better
>>>> scalability
>>> +1
>> Hard work / easy work.  Yes and no.  It depends on the motivation
>> of the contributor. Or we have to (re)define clearly the scope of
>> CM, and start some serious clean-up.
>> It's not all black or white; I'm quite convinced that it's better
>> to handle multi-threading externally when the core computation is
>> sequential.  But CM already contains algorithms that are inherently
>> parallel (a.o. genetic algorithms) and improvement in those areas
>> would undoubtedly benefit from (internal) parallel processing.
> I think the better approach is to support external parallelization
> rather than trying to do it yourself. From a user POV, I would be scared
> to use a library that does some kind of parallelization internally which
> I cannot control.

+1
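
To make "external parallelization" concrete, here is a minimal sketch of a computation that creates no threads of its own and runs on whatever executor the caller hands in. The class name ChunkedSum and its methods are hypothetical, made up for illustration; nothing like this exists in [math] as far as this thread says.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical example class; not part of Commons Math. */
public class ChunkedSum {

    /** Sequential kernel: sum of squares over the index range [from, to). */
    static double sumOfSquares(double[] data, int from, int to) {
        double s = 0;
        for (int i = from; i < to; i++) {
            s += data[i] * data[i];
        }
        return s;
    }

    /**
     * Splits the work into at most {@code chunks} tasks and submits them to
     * the caller's executor. This method never creates or shuts down threads.
     */
    public static double sumOfSquares(double[] data, int chunks, ExecutorService executor)
            throws InterruptedException, ExecutionException {
        if (chunks < 1) {
            throw new IllegalArgumentException("chunks must be >= 1");
        }
        int size = (data.length + chunks - 1) / chunks; // ceiling division
        List<Callable<Double>> tasks = new ArrayList<>();
        for (int start = 0; start < data.length; start += size) {
            final int from = start;
            final int to = Math.min(data.length, start + size);
            tasks.add(() -> sumOfSquares(data, from, to));
        }
        double total = 0;
        for (Future<Double> partial : executor.invokeAll(tasks)) {
            total += partial.get();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // The application owns the pool and decides how many threads it gets.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(sumOfSquares(new double[] {1, 2, 3, 4}, 2, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

With this shape, a web application sharing a machine with others can pass in a small or single-threaded pool, and the library code does not need to change.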
>
> Some recent examples show how it can be done better: there were some
> requests to make some of the statistics-related classes map/reducible so
> that they can be used in Java 8 parallel streams.

+1 - mostly done.
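
For readers who have not seen the pattern, a rough sketch of what "map/reducible" means for a statistic: an accumulator that takes one value at a time and can also merge with another partial result, which is what a Java 8 stream collect needs. MergeableMean below is a made-up class using only the JDK; it is not the actual [math] code and does not claim to match its API.

```java
import java.util.stream.DoubleStream;

/** Hypothetical mergeable accumulator; not a Commons Math class. */
public class MergeableMean {
    private long n;
    private double mean;

    /** Adds one observation (the "map" side) with a running-mean update. */
    public void accept(double x) {
        n++;
        mean += (x - mean) / n;
    }

    /** Merges another partial result into this one (the "reduce" side). */
    public void combine(MergeableMean other) {
        if (other.n == 0) {
            return;
        }
        long total = n + other.n;
        // Weighted update: shift the mean toward the other partial mean.
        mean += (other.mean - mean) * other.n / total;
        n = total;
    }

    public long getN() {
        return n;
    }

    public double getMean() {
        return mean;
    }

    /** The caller decides whether the stream passed in is parallel. */
    public static MergeableMean of(DoubleStream values) {
        return values.collect(MergeableMean::new, MergeableMean::accept, MergeableMean::combine);
    }

    public static void main(String[] args) {
        MergeableMean m = of(DoubleStream.of(1, 2, 3, 4).parallel());
        System.out.println(m.getN() + " values, mean = " + m.getMean());
    }
}
```

The accumulator itself is plain single-threaded code; whether it runs in parallel is entirely up to the client, by calling parallel() on the stream or not.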
>
> @genetic algorithms: there are far better libraries out there for
> this area and the support we have in math is really very simplistic. You
> can basically do just a few demo examples with it and I am more in favor
> of deprecating the package.

Agreed there is better stuff out there, but I like the structure of
what we have (weak as the capabilities may be).  I have often
thought about playing with replacing the GeneticAlgorithm and
Population implementations with M/R-capable things.  I bet this
could be done without changing our API at all - just using the
lower-level constructs in a distributed execution environment.  I
have not actually done this so am not sure it would work; but I
don't see why not.  This still leaves gaps in encoding, etc; but
those could be filled over time.  I would be -0 on deprecating the
package, partly because I am a user of it :)

Phil
>
>>> My HO is we should focus on getting the best single-threaded
>>> implementations we can and, where possible, setting things up to be
>>> executed in parallel by other engines.  Spawning and managing
>>> threads internal to [math] actually *reduces* the range of
>>> applicability of our stuff.
>> Examples?
> because not everybody wants a library to do parallel stuff internally.
> Just imagine math being used in a web-application deployed together with
> many other applications. It is clearly not an option that one
> application might take over most/all of the available processors.
>
>>>  Much better to let Hadoop / Mahout et
>>> al parallelize using fast and accurate piece parts that we can
>>> provide.
>> Do they really do that?
>> [Or do they implement their own algorithms knowing that they must
>> be thread-safe (which is something we don't focus a lot on).]
> I guess they have mainly their own algorithms, but there are examples of
> our stuff being used (using the map/reduce paradigm).
>
>>>  If there are parallel algorithms that we are really dying
>>> to implement directly, I would rather see that done in a way that
>>> encapsulates and enables externalization of the thread management.
>>>>  * staying at Java 6/7 does not prevent users from using math in a
>>>> Java 8 environment if wanted
>>> +1 - the examples I have seen thus far are all things that could be
>>> done fairly easily with client code.  I know we don't all agree with
>>> this, but I think the biggest service we can provide to our user
>>> base is good, tested, supported implementations of standard
>>> algorithms.  I wish we could find a way to focus more on that and
>>> less on fiddling with the API or language features.
> +1, I have the impression that the more we try to *optimize* an API, the
> more often we end up with an inferior solution (with a few exceptions).
>
> There is too much discussion about API design. We should have our best
> practices and use them to implement rock-solid algorithms, which is
> already difficult enough. In the end it does not matter so much if you
> have a fluent API or whatever, as long as it calculates the correct
> result, and is easy to use, imho.
>
>> The problem is that those discussions constantly mix considerations
>> about contents, with political moves that do not necessarily match.
>> For example, a statement about contents would be: CM only provides
>> implementations of sequential mathematical algorithms.
>> But recent political moves, like changing the version control system
>> or advertising "free for all" commit rights, aim at increasing the
>> contributor base.
> I think these considerations are orthogonal:
>
>  * what do you want to do? (aka the scope of the project)
>  * how do you want to do it?
>  * what infrastructure do you provide to your users/collaborators?
>
>> What about those people interested in API fixing and new language
>> features?  You'll make them want to contribute to another project.
>> Now that Java is, at last, beginning to catch up with other
>> languages incomparably more widely used in the scientific community,
>> Commons Math is discussing how far behind it is going to lag!
> Afaik the scientific community uses mainly Python with its abundance of
> great tools. I think Java is better suited in an engineering context.
>
> Thomas
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>


