Brent Worden wrote:
-----Original Message-----
From: Phil Steitz [mailto:[EMAIL PROTECTED]
Sent: Friday, June 06, 2003 12:21 PM
To: [EMAIL PROTECTED]
Subject: [math] proposed ordering for task list, scope of initial release


Here is a *proposed* ordering for the task list, with a little commentary added.

One thing that I want to make *very* clear up front is that I *never* intended
the task list or the items listed in the scope section of the proposal to be
definitive.  All that is definitive are the guiding principles, which just try
to keep us focused on stuff that people will find both useful and easy to use.
I expected that the actual contents of the first release would include some
things not on the list and would exclude some of the things there.  At this
stage, as Jouzas pointed out, it is more important for us to build community
than to rush a release out the door. So if there are things that fit the
guidelines that others would like to contribute, but which are not on the list,
*please* suggest them.  Also, for those who may not have dug into the code, but
who may be interested in contributing, please rest assured that deep
mathematical knowledge is not required to help. We can review implementations
and deal with mathematical problems as they arise, using our small but growing
community as a resource.  The same is obviously true on the Java/OS tools
side -- no need to be an expert to contribute.

OK, long-winded disclaimer aside, here is how I see the task list ordered:

* The RealMatrixImpl class is missing some key method implementations. The
critical thing is solution of linear systems. We need to implement a
numerically sound solution algorithm. This will enable inverse() and also
support general linear regression. -- I think that Brent is working on this.


The only thing I've done is the Cholesky decomposition.  I haven't done
anything for the general linear system case.

Are you going to do this, or should I take it on?
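
For reference, one common way to solve the general case is Gaussian elimination with partial pivoting. The following is only a minimal sketch under that assumption, not the algorithm anyone in this thread has committed to; the class name LinearSolverSketch and its solve() method are hypothetical and are not part of RealMatrixImpl.

```java
/** Hypothetical helper (not part of RealMatrixImpl): solves A x = b. */
final class LinearSolverSketch {

    private LinearSolverSketch() {
    }

    /**
     * Solves a square system by Gaussian elimination with partial pivoting.
     * The inputs are copied, so the caller's arrays are left untouched.
     */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        double[][] m = new double[n][];
        for (int i = 0; i < n; i++) {
            m[i] = a[i].clone();
        }
        double[] x = b.clone();

        for (int col = 0; col < n; col++) {
            // Use the row with the largest entry in this column as the pivot.
            int pivot = col;
            for (int row = col + 1; row < n; row++) {
                if (Math.abs(m[row][col]) > Math.abs(m[pivot][col])) {
                    pivot = row;
                }
            }
            if (m[pivot][col] == 0.0) {
                throw new IllegalArgumentException("matrix is singular");
            }
            double[] tmpRow = m[col];
            m[col] = m[pivot];
            m[pivot] = tmpRow;
            double tmp = x[col];
            x[col] = x[pivot];
            x[pivot] = tmp;

            // Eliminate the entries below the pivot.
            for (int row = col + 1; row < n; row++) {
                double factor = m[row][col] / m[col][col];
                x[row] -= factor * x[col];
                for (int k = col; k < n; k++) {
                    m[row][k] -= factor * m[col][k];
                }
            }
        }

        // Back substitution on the resulting upper-triangular system.
        for (int row = n - 1; row >= 0; row--) {
            double sum = x[row];
            for (int k = row + 1; k < n; k++) {
                sum -= m[row][k] * x[k];
            }
            x[row] = sum / m[row][row];
        }
        return x;
    }
}
```

With something like this in place, inverse() could be obtained by solving against each column of the identity matrix, although a stored decomposition would avoid repeating the elimination for every column.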

* t-test statistic needs to be added and we should probably add the capability
of actually performing t- and chi-square tests at fixed significance levels
(.1, .05, .01, .001). -- This is virtually done, just need to define a nice,
convenient interface for doing one- and two-tailed tests.  Thanks to Brent, we
can actually support user-supplied significance levels (next item).


Anyone have any thoughts on the interface?  I was thinking of an Inference
interface that supports the conducting of one- and two-tailed tests as well
as constructing their complementary confidence intervals.  Or, if we want to
separate concerns create both a HypothesisTest and a ConfidenceInterval
interface, one for each type of inference.  Either way, I would use the
tried-and-true abstract factory way of creating inference instances.
Comments are welcome.
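
To make the "separate concerns" variant concrete, here is a rough sketch of what the two interfaces and an abstract factory might look like. Every name and signature in it (HypothesisTest, ConfidenceInterval, InferenceFactory, oneSampleTTest, meanInterval) is an assumption for the sake of discussion, and the caller passes in the critical value so the sketch does not depend on the distribution code.

```java
/** Hypothetical test abstraction; the caller supplies the critical value. */
interface HypothesisTest {
    double statistic();
    boolean rejectTwoTailed(double criticalValue);
    boolean rejectUpperTailed(double criticalValue);
}

/** Hypothetical interval abstraction. */
interface ConfidenceInterval {
    double lower();
    double upper();
}

/** Abstract factory: client code depends on this type only. */
abstract class InferenceFactory {
    static InferenceFactory newInstance() {
        return new DefaultInferenceFactory();
    }

    abstract HypothesisTest oneSampleTTest(double[] sample, double mu0);

    abstract ConfidenceInterval meanInterval(double[] sample, double criticalValue);
}

/** One possible implementation, based on the one-sample t statistic. */
final class DefaultInferenceFactory extends InferenceFactory {

    private static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** Standard error of the mean, using the n - 1 sample variance. */
    private static double standardError(double[] x) {
        double m = mean(x);
        double ss = 0.0;
        for (double v : x) {
            ss += (v - m) * (v - m);
        }
        return Math.sqrt(ss / (x.length - 1) / x.length);
    }

    @Override
    HypothesisTest oneSampleTTest(double[] sample, double mu0) {
        final double t = (mean(sample) - mu0) / standardError(sample);
        return new HypothesisTest() {
            @Override
            public double statistic() {
                return t;
            }

            @Override
            public boolean rejectTwoTailed(double criticalValue) {
                return Math.abs(t) > criticalValue;
            }

            @Override
            public boolean rejectUpperTailed(double criticalValue) {
                return t > criticalValue;
            }
        };
    }

    @Override
    ConfidenceInterval meanInterval(double[] sample, double criticalValue) {
        final double m = mean(sample);
        final double half = criticalValue * standardError(sample);
        return new ConfidenceInterval() {
            @Override
            public double lower() {
                return m - half;
            }

            @Override
            public double upper() {
                return m + half;
            }
        };
    }
}
```

Usage would be along the lines of InferenceFactory.newInstance().oneSampleTTest(sample, mu0).rejectTwoTailed(critical). A real version would presumably take a significance level and look up the critical value itself.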


* numerical approximation of the t- and chi-square distributions to enable
user-supplied significance levels.  See above.  Someone just needs to put a
fork in this. Tim? Brent?


Done.

Including the testing interface? See below.


* *new* add support for F distribution and F test, so that we can report
significance level of correlation coefficient in bivariate regression /
significance of model.  I will do this if no one else wants to.


Done.  I'll probably knock out a few more easy continuous distributions to
get them out of the way.


* Framework and implementation strategy(ies) for finding roots of real-valued
functions of one (real) variable.  Here again -- largely done.  I would prefer
to wait until J gets back and let him submit his framework and R. Brent's
algorithm.  Then "our" Brent's implementation and usage can be integrated
(actually not much to do, from the looks of the current code) and I will add my
"bean equations" stuff (in progress).


Sounds good.
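
For anyone who has not looked at the solver code, the general shape of a root finder is shown below. This is plain bisection, swapped in because it is short. It is not R. Brent's algorithm and not J's framework. The names UnivariateFunctionSketch and BisectionSketch are made up for this illustration.

```java
/** Hypothetical function abstraction for this sketch only. */
interface UnivariateFunctionSketch {
    double value(double x);
}

/** Plain bisection on a bracketing interval [lo, hi] with lo < hi. */
final class BisectionSketch {

    private static final int MAX_ITERATIONS = 200;

    private BisectionSketch() {
    }

    static double solve(UnivariateFunctionSketch f, double lo, double hi, double tol) {
        double fLo = f.value(lo);
        double fHi = f.value(hi);
        if (fLo == 0.0) {
            return lo;
        }
        if (fHi == 0.0) {
            return hi;
        }
        if (fLo * fHi > 0.0) {
            throw new IllegalArgumentException("root is not bracketed");
        }
        // Halve the interval until it is narrower than tol; the iteration
        // cap guards against a tolerance below floating-point resolution.
        for (int i = 0; i < MAX_ITERATIONS && hi - lo > tol; i++) {
            double mid = lo + 0.5 * (hi - lo);
            double fMid = f.value(mid);
            if (fMid == 0.0) {
                return mid;
            }
            if (fLo * fMid < 0.0) {
                hi = mid;          // sign change is in the lower half
            } else {
                lo = mid;          // sign change is in the upper half
                fLo = fMid;
            }
        }
        return lo + 0.5 * (hi - lo);
    }
}
```

Bisection only needs a sign change over the interval, which is why it is a reasonable fallback even when a faster method is the default.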


* Extend distribution framework to support discrete distributions and implement
binomial and hypergeometric distributions.  I will do this if no one else wants
to.  If someone else does it, you should make sure to use the log binomials in
computations.


Binomial can easily be obtained using the regularized beta function that is
already defined.  Hypergeometric will be a little more work as I don't think
there's a compact formula to compute the cpf.

Using the log binomials, direct computation of the density might not be too bad. I have not researched this, but that is what I was thinking.


One thing to note: since the discrete distributions do not have nice
invertible mappings for critical values to probabilities like those found for
continuous distributions, how should the inverseCumulativeProbability method
work?  For a given probability, p, should the method return one value, x, such
that x is the largest value where P(X <= x) <= p?  Or the smallest value, x,
where P(X <= x) >= p?  Or should the method return two values, x0 and x1, such
that P(X <= x0) <= p <= P(X <= x1)?

I think in the discrete case, we should supply the density function (and the cumulative probability function) and probably omit the inverseCumulativeProbability method. If we were to add it, I would use the second of your alternatives above.
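
To illustrate both points for the binomial case, here is a small sketch that computes the density through log binomial coefficients and, should the inverse be wanted after all, applies the second alternative (smallest x with P(X <= x) >= p). The class BinomialSketch and its method names are assumptions, not existing code, and the cumulative probability is a naive sum rather than the regularized beta route mentioned above.

```java
/** Hypothetical binomial helper; n is the number of trials, p the success probability. */
final class BinomialSketch {

    private BinomialSketch() {
    }

    /** log of "n choose k", accumulated term by term to avoid overflow. */
    static double logBinomialCoefficient(int n, int k) {
        double sum = 0.0;
        for (int i = 1; i <= k; i++) {
            sum += Math.log(n - k + i) - Math.log(i);
        }
        return sum;
    }

    /** P(X = k), computed on the log scale and exponentiated at the end. */
    static double probability(int n, double p, int k) {
        if (k < 0 || k > n) {
            return 0.0;
        }
        if (p == 0.0) {
            return k == 0 ? 1.0 : 0.0;
        }
        if (p == 1.0) {
            return k == n ? 1.0 : 0.0;
        }
        return Math.exp(logBinomialCoefficient(n, k)
                + k * Math.log(p) + (n - k) * Math.log(1.0 - p));
    }

    /** P(X <= k) by direct summation of the density. */
    static double cumulativeProbability(int n, double p, int k) {
        double sum = 0.0;
        for (int i = 0; i <= Math.min(k, n); i++) {
            sum += probability(n, p, i);
        }
        return sum;
    }

    /** Smallest x such that P(X <= x) >= q. */
    static int inverseCumulativeProbability(int n, double p, double q) {
        double cumulative = 0.0;
        for (int x = 0; x < n; x++) {
            cumulative += probability(n, p, x);
            if (cumulative >= q) {
                return x;
            }
        }
        return n;
    }
}
```

One caveat with any of the three conventions: when q sits exactly on a step of the cumulative function, rounding error in the sum can push the answer one value either way.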




* Exponential growth and decay (set up for financial applications) I think this
is just going to be a matter of finding the right formulas to add to MathUtils.
I don't want to get carried away with financial computations, but some simple,
commonly used formulas would be a nice addition to the package.  We should also
be thinking about other things to add to MathUtils -- religiously adhering to
the guiding principles, of course.  Al's sign() is an excellent example of the
kind of thing that we should be adding, IMHO.
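
As a rough idea of the kind of formulas meant here, the sketch below covers continuous growth or decay, periodic compounding, and half-life. The class GrowthSketch and its method names are hypothetical. The formulas themselves are the standard ones, A = P * e^(r*t) and A = P * (1 + r/m)^(m*t).

```java
/** Hypothetical static helpers for exponential growth and decay. */
final class GrowthSketch {

    private GrowthSketch() {
    }

    /** Continuous growth (rate > 0) or decay (rate < 0): initial * e^(rate * time). */
    static double continuous(double initial, double rate, double time) {
        return initial * Math.exp(rate * time);
    }

    /** Value after compounding timesPerPeriod times in each of the given periods. */
    static double compound(double principal, double rate, int timesPerPeriod, double periods) {
        return principal * Math.pow(1.0 + rate / timesPerPeriod, timesPerPeriod * periods);
    }

    /** Time for a quantity decaying at the given positive rate to halve. */
    static double halfLife(double decayRate) {
        return Math.log(2.0) / decayRate;
    }
}
```

For example, compound(100.0, 0.10, 1, 2.0) models 100 growing at 10% per period for two periods with annual-style compounding.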


Things that might be added:
Average of two numbers comes up a lot.

Yes. Some (of us) might not like the organization of this, but I have posted a couple of times the suggestion that we add several double[]->double functions to MathUtils representing the core computations for univariate -- mean, min, max, variance, sum, sumsq. This would be convenient for users and us as well. I guess I would not be averse to moving these to stat.StatUtils, maybe just adding ave(x,y) to MathUtils.


Given the post that I just saw regarding financial computations, I suggest that we let MathUtils grow a bit (including the double[]->double functions) and then think about breaking it apart prior to release. As long as we stick to simple static methods, that will not be hard to do.
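
To show what the double[]->double functions could look like as simple static methods, here is a sketch. The class name StatUtilsSketch is a placeholder, and the choices it makes (n - 1 variance, NaN for too-small input) are assumptions that would need to be agreed on.

```java
/** Hypothetical home for the double[] -> double computations. */
final class StatUtilsSketch {

    private StatUtilsSketch() {
    }

    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    static double sumsq(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v * v;
        }
        return total;
    }

    /** Arithmetic mean; NaN for an empty array. */
    static double mean(double[] values) {
        return sum(values) / values.length;
    }

    /** Sample variance (n - 1 denominator), two-pass; NaN if fewer than 2 values. */
    static double variance(double[] values) {
        if (values.length < 2) {
            return Double.NaN;
        }
        double m = mean(values);
        double ss = 0.0;
        for (double v : values) {
            ss += (v - m) * (v - m);
        }
        return ss / (values.length - 1);
    }

    /** Smallest value; NaN for an empty array. */
    static double min(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double result = values[0];
        for (double v : values) {
            result = Math.min(result, v);
        }
        return result;
    }

    /** Largest value; NaN for an empty array. */
    static double max(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double result = values[0];
        for (double v : values) {
            result = Math.max(result, v);
        }
        return result;
    }
}
```

Because these are stateless static methods, moving them between MathUtils and a stat package later is a mechanical change.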

Something similar to JUnit's assertEquals(double expected, double actual,
double epsilon).

Good idea


Simple methods like isPositive, isNegative, etc. can be used to make boolean
expressions more human readable.

I agree
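
A sketch of how the suggestions so far (average of two numbers, an epsilon comparison in the style of JUnit's assertEquals, and the readability predicates) might look as static methods. The class name MathUtilsSketch and the exact signatures are assumptions for discussion only.

```java
/** Hypothetical additions along the lines discussed for MathUtils. */
final class MathUtilsSketch {

    private MathUtilsSketch() {
    }

    /** Average of two numbers; halving first avoids overflow for large inputs. */
    static double ave(double x, double y) {
        return 0.5 * x + 0.5 * y;
    }

    /** True if the two values differ by no more than epsilon. */
    static boolean equals(double expected, double actual, double epsilon) {
        return Math.abs(expected - actual) <= epsilon;
    }

    /** True for values strictly greater than zero (false for 0.0, -0.0 and NaN). */
    static boolean isPositive(double x) {
        return x > 0.0;
    }

    /** True for values strictly less than zero (false for 0.0, -0.0 and NaN). */
    static boolean isNegative(double x) {
        return x < 0.0;
    }
}
```

The epsilon comparison reads as MathUtilsSketch.equals(0.1 + 0.2, 0.3, 1e-12), which holds even though the two doubles are not bit-for-bit identical.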


Some other constants besides E and PI: golden ratio, euler, sqrt(PI), etc.
I've used a default error constant several places.

I get the first 3, but what exactly do you mean by the default error constant?


It would be nice to come
up with a central location for such values.


In addition to the above, has any thought gone into a set of application exceptions that will be thrown? Are we going to rely on Java core exceptions or are we going to create some application specific exceptions? As I recall J uses a MathException in the solver routines and I added a ConvergenceException. Should we expand that list or fold it into one generic application exception or do away with application exceptions altogether?

My philosophy on this is that whatever exceptions we define should be "close" to the components that throw them -- e.g. ConvergenceException. I do not like the idea of a generic "MathException." As much as possible, I think that we should rely on the built-ins (including the extensions recently added to lang). Regarding ConvergenceException, I am on the fence for inclusion in the initial release, though I see something like this as eventually inevitable. Correct me if I am wrong, but the only place that this is used now is in the dist package and we could either just throw a RuntimeException directly there or return NaN. I do see the semantic value of ConvergenceException, however. I guess I would vote for keeping it.
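
To illustrate what "close to the component" could mean in practice, here is a sketch of an iterative routine that throws a dedicated exception when it runs out of iterations. The classes ConvergenceExceptionSketch and SeriesSketch are stand-ins made up for this message, not the existing ConvergenceException or the dist package code, and making the exception checked is just one of the options under debate.

```java
/** Hypothetical stand-in for a convergence-specific exception. */
class ConvergenceExceptionSketch extends Exception {
    ConvergenceExceptionSketch(String message) {
        super(message);
    }
}

/** Example component whose only failure mode is lack of convergence. */
final class SeriesSketch {

    private SeriesSketch() {
    }

    /**
     * Sums the power series for e^x until a term drops below epsilon,
     * giving up after maxIterations terms.
     */
    static double exp(double x, double epsilon, int maxIterations)
            throws ConvergenceExceptionSketch {
        double term = 1.0;
        double sum = 1.0;
        for (int n = 1; n <= maxIterations; n++) {
            term *= x / n;
            sum += term;
            if (Math.abs(term) < epsilon) {
                return sum;
            }
        }
        throw new ConvergenceExceptionSketch(
                "no convergence after " + maxIterations + " iterations");
    }
}
```

The alternatives mentioned above, throwing a RuntimeException or returning NaN, would replace the final throw statement. The difference is whether callers are forced to handle the failure.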


Brent Worden http://www.brent.worden.org


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]







