(2.) Considerations


a.) Is consistent library design important? Can all these models interface effectively? Are all these different design models required? Is there a single design model that can span the entire library?

IMHO, the most important considerations are 1) how easy the library will be to navigate and use 2) how maintainable it will be and 3) how well it will perform (including resource requirements). For all of these, I think that it is best to look at practical application use cases -- e.g., if someone wants to solve a linear system or estimate a bivariate regression model, how easy will it be for them to do that using commons-math? How well will the solution perform and scale? How easy will it be for us to extend it? Since the library does different kinds of things to satisfy some fundamentally different use cases (e.g. generating a random sample vs modelling algebraic operations on a real matrix) it is natural that multiple design patterns and implementation strategies are used. Trying to force all commons-math components into a single abstract model is not necessary, IMHO, and would likely make the library harder to use and maintain.



b.) Which design strategy is of more interest to the project? Small, compact classes used to complete one specific task, where the programming is primarily OOD in nature? Or feature-rich interfaces that encapsulate full ranges of functionality, where the programming is primarily "procedural" in nature?

Here again, my opinion is that this should be determined by the practical problem being addressed.



c.) What is the platform of interest for [math]? Server side or application?

We should certainly not have to choose here -- nor should our users. Nothing in the current implementation would create problems in server-side applications -- at least nothing that I can see. If others can see problems, we need to identify and address these specifically. The most important thing to do here is to clearly document interfaces so that users know what is stateful, what is not, what is thread-safe, what is not, what creates singletons, pools or external storage (nothing so far), etc.



d.) Should static method utilities be avoided at all costs in both cases? Or are there specific situations where static functions do not create garbage collection considerations and issues (like when only returning primitives)?

I am starting to think that we should avoid static methods, and in fact change StatUtils to require instantiation, but this has nothing to do with garbage collection, since with a stateless collection of static methods, there is no "garbage" to collect -- just a class loaded once per JVM. As long as there is no state associated with the class, I don't see how classloader problems could really come into play (unless users were relying on classloader priority to load different versions, which is IMHO a bad idea and could apply to any of our classes). The real issue here is extensibility. As I think more about the use cases for StatUtils, I am less convinced than I was before that the "convenience" and "efficiency" of not having to create an instance is worth the anxiety about support for extensibility. Therefore, I would support changing the methods in StatUtils to be non-static.
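To make the extensibility point concrete, here is a minimal sketch. All class names in it are made up for illustration; this is not the current StatUtils API, just one way the difference could look. The static version pins every caller to a single implementation, while the instance version (equally stateless) lets a subclass swap in a different algorithm without touching client code.

```java
// Hypothetical names throughout; this is NOT the commons-math StatUtils API.

/** Static style: callers bind to this exact implementation at compile time. */
final class StaticStats {
    private StaticStats() { }

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }
}

/** Instance style: still stateless, but a subclass can replace the algorithm. */
class Stats {
    double mean(double[] values) {
        return StaticStats.mean(values);
    }
}

/** Illustrative extension: the mean of the finite values only. */
class FiniteOnlyStats extends Stats {
    @Override
    double mean(double[] values) {
        double sum = 0.0;
        int n = 0;
        for (double v : values) {
            if (!Double.isNaN(v) && !Double.isInfinite(v)) {
                sum += v;
                n++;
            }
        }
        return n == 0 ? Double.NaN : sum / n;
    }
}

class StatsDemo {
    /** Client code written against Stats works with any subclass. */
    static double report(Stats stats, double[] data) {
        return stats.mean(data);
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, Double.NaN, 3.0};
        System.out.println(report(new Stats(), data));           // NaN (the NaN propagates)
        System.out.println(report(new FiniteOnlyStats(), data)); // 2.0
    }
}
```

Neither version holds any state, so neither leaves anything behind for the garbage collector beyond a small, short-lived instance in the second case.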




(3.) A couple proposals:

(i.) Brent and Pietschmann, can you make suggestions/recommendations as to how your "function object" model could be applied to StaticUtil situations? Are you familiar with the Functors project, and is there a possibility that they should be considered as the basic design strategy or base of implementation for your "Function Object" design? If they are a Commons project as well, is there a possible dependency here we could take advantage of?

My opinion here is that a univariate real function is an object in its own right. I suppose that it could extend Functor, but I do not see the point in this, and I would personally not like having to lose the typing and force the use of, and casting to, Double in order to implement
Object evaluate(Object obj);
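
A rough sketch of the trade-off, in case it helps. Both interfaces below are hypothetical stand-ins that I made up for this example (they are neither the analysis package interfaces nor the Functors ones); only the evaluate signature is taken from above. The typed version works on primitives directly, while the generic version has to box, cast and unbox, and a wrong argument type only shows up at runtime.

```java
/** Typed style: a univariate real function maps double to double. Hypothetical interface. */
interface RealFunction {
    double value(double x);
}

/** Generic functor style, using the signature quoted above. Hypothetical interface. */
interface ObjectFunctor {
    Object evaluate(Object obj);
}

/** f(x) = x^2 in the typed style: no boxing, no casts. */
class Square implements RealFunction {
    public double value(double x) {
        return x * x;
    }
}

/** The same function in the functor style: cast in, box out. */
class SquareFunctor implements ObjectFunctor {
    public Object evaluate(Object obj) {
        double x = ((Double) obj).doubleValue(); // ClassCastException if obj is not a Double
        return Double.valueOf(x * x);
    }
}

class FunctionDemo {
    public static void main(String[] args) {
        double typed = new Square().value(3.0);
        double untyped =
            ((Double) new SquareFunctor().evaluate(Double.valueOf(3.0))).doubleValue();
        System.out.println(typed + " " + untyped); // 9.0 9.0
    }
}
```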


It should also be noted that this is a relatively trivial part of what is really going on in the analysis package (i.e. rootfinding and spline fitting).


(ii.) Robert, can you provide any information that relates to (d.) above? And are there cases where static utils are OK in a server environment when there are significant garbage collection concerns?


(iii.) All, can we consider whether there is a consistent approach to dealing with equation-like math evaluations, whether matrix, statistical or numeric? Finding and defining such a consistency will enhance the "plug-n-play" capabilities of the library, providing a means to easily learn, use and combine functionality across various parts of the library.



Repeat comments above. I do not believe that there is a "one size fits all" nor that trying to force everything into a single pattern (which I find hard to imagine) will make things easier. Remember that many (most?) users will come to commons-math with a specific problem in mind and they will not likely want to invest large amounts of time in learning the "commons-math design philosophy". If we want to meet the goals in the proposal, we will want to make things as simple and "natural" to users as possible. Obviously, the 10000 <favorite currency> question is what is most "natural" for each functional area in commons-math.


Another point that we need to keep in mind is that we have a naturally layered structure, which will become even more so over time. We should be liberal in exposing technical functionality that "most users" will not use and the mathematical orientation of the interfaces will naturally increase as you go deeper into the infrastructure. For example, Brent did the hard work to derive and implement some special functions that reside in the special package. These were the key to providing the statistical testing/confidence intervals that "more users" may use. "Most users" will not use the special functions directly -- but it is very nice to have them exposed for the mathematical programmers who want to exploit their many uses beyond what we have used them internally for. Moving up the layers, "most users" will not use the Chi-Square distribution directly (which builds on special); but that is also very nice to have. Continuing up the call stack leading to the stats tests, we come to rootfinding, which more users will use directly and finally to the statistical tests and confidence intervals, which will likely be used directly quite a bit by people with no understanding or interest in either rootfinding or special functions. At each of the layers, a different level of mathematical sophistication is expected and different kinds of interfaces are "natural".

Finally, I think that it is appropriate in some cases to expose what amounts to the "same functionality" via multiple different kinds of interfaces. For example, to get the mean of a collection of doubles, you can now a) use StatUtils if what you have is an array of doubles and all you want is the mean, b) instantiate a storage-less Univariate and feed the values in (good for long lists of values that you don't want to store in memory), or c) if the numbers whose mean you want happen to be exposed as properties of a collection of beans, instantiate a BeanListUnivariate and use it to get the mean. I see absolutely nothing wrong with this and in fact a lot that is "right" with this -- practical use cases drive design and the result is flexibility, convenience and ease of use.
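
Here is a small sketch of those three entry points side by side. The classes are simplified, hypothetical stand-ins that I wrote for this message, not the real StatUtils, Univariate or BeanListUnivariate; in particular, (c) takes a property-accessor function as an assumption, rather than doing any bean introspection.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** (a) Array in, number out. Hypothetical stand-in for a static utility. */
final class ArrayMean {
    private ArrayMean() { }

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }
}

/** (b) Storage-less accumulator: keeps only a count and a sum, never the values. */
class RunningMean {
    private long n;
    private double sum;

    void addValue(double v) {
        n++;
        sum += v;
    }

    double getMean() {
        return n == 0 ? Double.NaN : sum / n;
    }
}

/** (c) Mean of one numeric property across a list of objects. */
final class PropertyMean {
    private PropertyMean() { }

    static <T> double mean(List<T> items, ToDoubleFunction<T> property) {
        RunningMean acc = new RunningMean();
        for (T item : items) {
            acc.addValue(property.applyAsDouble(item));
        }
        return acc.getMean();
    }
}

/** Illustrative bean with a single numeric property. */
class Order {
    private final double amount;

    Order(double amount) {
        this.amount = amount;
    }

    double getAmount() {
        return amount;
    }
}

class MeanDemo {
    public static void main(String[] args) {
        double a = ArrayMean.mean(new double[] {2.0, 4.0, 6.0});

        RunningMean running = new RunningMean();
        running.addValue(2.0);
        running.addValue(4.0);
        running.addValue(6.0);
        double b = running.getMean();

        List<Order> orders = Arrays.asList(new Order(2.0), new Order(4.0), new Order(6.0));
        double c = PropertyMean.mean(orders, Order::getAmount);

        System.out.println(a + " " + b + " " + c); // 4.0 4.0 4.0
    }
}
```

Same answer each time; what differs is only what the caller has to have in hand before asking for it.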

Phil


