On 01/16/2015 01:30 AM, Gilles wrote:
> On Thu, 15 Jan 2015 15:41:11 -0700, Phil Steitz wrote:
>> On 1/15/15 2:24 PM, Thomas Neidhart wrote:
>>> On 01/08/2015 12:34 PM, Gilles wrote:
>>>> Hi.
>>>>
>>>> Raising this issue once again.
>>>> Are we going to upgrade the requirement for the next major release?
>>>>
>>> [ ] Java 5
>>> [x] Java 6
>>> [x] Java 7
>>> [ ] Java 8
>>> [ ] Java 9
>>>
>>> A while ago I thought that it would be cool to switch to Java 7/8 for
>>> some of the nice new features (mainly fork/join, lambda expressions and
>>> diamond operator, the rest is more or less unimportant for math imho).
>>>
>>> But after some thoughts I think they are not really needed for the
>>> following reasons:
>>>
>>> * the main focus of math is on developing high-quality, well tested and
>>> documented algorithms, the existing language features are more than
>>> enough for this
>
> Sure.
> Not so long ago, some people were claiming that nothing beats
> programming in "assembly" language.
>
>> +1
>>>
>>> * coming up with multi-threaded algorithms might be appealing but it is
>>> also hard work and I wonder if it really makes sense in the times of
>>> projects like mahout / hadoop / ... which aim for even better
>>> scalability
>>
>> +1
>
> Hard work / easy work. Yes and no. It depends on the motivation
> of the contributor. Or we have to (re)define clearly the scope of
> CM, and start some serious clean-up.
> It's not all black or white; I'm quite convinced that it's better
> to handle multi-threading externally when the core computation is
> sequential. But CM already contains algorithms that are inherently
> parallel (a.o. genetic algorithms) and improvement in those areas
> would undoubtedly benefit from (internal) parallel processing.

I think the better approach is to support external parallelization
rather than trying to do it yourself. From a user POV, I would be
scared to use a library that does some kind of parallelization
internally which I cannot control.

Some recent examples show how it can be done better: there were some
requests to make some of the statistics-related classes map/reducible
so that they can be used in Java 8 parallel streams.

@genetic algorithms: there are far better libraries out there for
this area, and the support we have in math is really very simplistic.
You can basically do just a few demo examples with it, and I am more
in favor of deprecating the package.

>> My HO is we should focus on getting the best single-threaded
>> implementations we can and, where possible, setting things up to be
>> executed in parallel by other engines. Spawning and managing
>> threads internal to [math] actually *reduces* the range of
>> applicability of our stuff.
>
> Examples?

Because not everybody wants a library to do parallel stuff internally.
Just imagine math being used in a web application deployed together
with many other applications. It is clearly not an option that one
application might take over most/all of the available processors.

>> Much better to let Hadoop / Mahout et
>> al parallelize using fast and accurate piece parts that we can
>> provide.
>
> Do they really do that?
> [Or do they implement their own algorithms knowing that they must
> be thread-safe (which is something we don't focus a lot on).]

I guess they mainly have their own algorithms, but there are examples
of our stuff being used (using the map/reduce paradigm).

>> If there are parallel algorithms that we are really dying
>> to implement directly, I would rather see that done in a way that
>> encapsulates and enables externalization of the thread management.
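
To make the "map/reducible statistics" idea above concrete, here is a
minimal sketch. It is only an illustration: MeanVarianceAccumulator is
a hypothetical class, not an existing [math] API, and the client side
assumes Java 8 streams. The class itself is single-threaded and never
spawns a thread; it only offers an accept step (Welford's update) and a
combine step (the pairwise merge formula of Chan et al.), so the caller
decides whether and how to run it in parallel.

```java
import java.util.stream.DoubleStream;

/** Hypothetical mergeable accumulator; NOT part of Commons Math. */
final class MeanVarianceAccumulator {
    private long n;
    private double mean;
    private double m2; // sum of squared deviations from the running mean

    /** "Map" side: fold one value into this partial result (Welford update). */
    void accept(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    /** "Reduce" side: merge another partial result into this one. */
    void combine(MeanVarianceAccumulator other) {
        if (other.n == 0) {
            return;
        }
        if (n == 0) {
            n = other.n;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = n + other.n;
        double delta = other.mean - mean;
        // Update m2 first: it needs the old n and the old mean.
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        mean += delta * other.n / total;
        n = total;
    }

    long getN() {
        return n;
    }

    double getMean() {
        return mean;
    }

    /** Unbiased sample variance; NaN when fewer than two values were seen. */
    double getVariance() {
        return n > 1 ? m2 / (n - 1) : Double.NaN;
    }

    /** Client-side helper: the caller chooses sequential or parallel. */
    static MeanVarianceAccumulator of(DoubleStream values) {
        return values.collect(MeanVarianceAccumulator::new,
                              MeanVarianceAccumulator::accept,
                              MeanVarianceAccumulator::combine);
    }
}
```

A client would call MeanVarianceAccumulator.of(stream) for a
sequential run or MeanVarianceAccumulator.of(stream.parallel()) for a
parallel one. The thread management stays entirely in the client (or in
whatever engine calls accept and combine), which is the point made
above.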
>>>
>>> * staying at Java 6/7 does not block users to use math in a Java 8
>>> environment if wanted
>>
>> +1 - the examples I have seen thus far are all things that could be
>> done fairly easily with client code. I know we don't all agree with
>> this, but I think the biggest service we can provide to our user
>> base is good, tested, supported implementations of standard
>> algorithms. I wish we could find a way to focus more on that and
>> less on fiddling with the API or language features.

+1, I have the impression that the more we try to *optimize* an API,
the more likely we are to end up with an inferior solution (with a few
exceptions). There is too much discussion about API design. We should
have our best practices and use them to implement rock-solid
algorithms, which is already difficult enough.

In the end it does not matter so much whether you have a fluent API or
whatever, as long as it calculates the correct result and is easy to
use, imho.

> The problem is that those discussions constantly mix considerations
> about contents, with political moves that do not necessarily match.
> For example, a statement about contents would be: CM only provides
> implementations of sequential mathematical algorithms.
> But recent political moves, like changing the version control system
> or advertizing "free for all" commit rights, aim at increasing the
> contributor base.

I think these considerations are orthogonal:

 * what do you want to do? (aka the scope of the project)
 * how do you want to do it?
 * what infrastructure do you provide to your users/collaborators?

> What about those people interested in API fixing and new language
> features? You'll make them want to contribute to another project.
> Now that Java is, at last, beginning to catch up with other
> languages incomparably more widely used in the scientific community,
> Commons Math is discussing how far behind it is going to lag!

Afaik the scientific community mainly uses Python with its abundance
of great tools.
I think Java is better suited in an engineering context.

Thomas