Re: [Statistics] Convention when outside support?

2019-11-29 Thread Fran Lattanzio
Hi,

I was involved in a similar debate on a different project, and we came to the 
conclusion that (double -> double) methods in Java should return NaN in the 
case of invalid arguments, rather than throw exceptions.

Our reasoning was by analogy with how IEEE 754 floating-point exceptions are 
handled by Java. Obviously, the definition of a floating-point exception is 
quite different from a Java exception. But anyway, our question was, how should 
raising an exception in the floating-point world map to throwing an exception 
in Java? The core Java libraries effectively behave as if all floating-point 
traps are disabled*: overflow results in an infinity, underflow in a 
subnormal or zero, the square root of a negative number returns NaN, etc.
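That default behavior is easy to demonstrate directly:

```java
public class NoTrapDemo {
    public static void main(String[] args) {
        // Core Java behaves as if all IEEE 754 traps are disabled:
        System.out.println(Double.MAX_VALUE * 2);  // overflow  -> Infinity
        System.out.println(Double.MIN_VALUE / 2);  // underflow -> 0.0
        System.out.println(Math.sqrt(-1.0));       // invalid   -> NaN
        System.out.println(Math.log(-1.0));        // invalid   -> NaN
    }
}
```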

Based on this, we decided that returning NaN is the “best” behavior, since this 
is what the IEEE spec says to do when trapping on the invalid-operation 
exception is disabled.

Fran.

* = We did discuss having a kind of floating-point signal policy that would 
change the behavior from returning a default value to throwing a (Java) 
exception when these floating-point exceptions were detected. But this would be 
a complex implementation problem, not least because incorporating this into 
existing numerical libraries would be difficult to impossible.



> On Nov 29, 2019, at 11:48 AM, Gilles Sadowski  wrote:
> 
> Hello.
> 
> For all implemented distributions, what convention should be adopted
> when methods
> * density(x)
> * logDensity(x)
> * cumulativeProbability(x)
> are called with "x" out of the "support" bounds?
> 
> Currently some (but not all[1]) are documented to return "NaN".
> An alternative could be to throw an exception.
> 
> Regards,
> Gilles
> 
> [1] https://issues.apache.org/jira/projects/MATH/issues/MATH-1503
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 





[math] New finite differencing framework

2016-02-17 Thread Fran Lattanzio
Hello all,

(I tried to send this yesterday, but it looks like my message bounced,
as I received some weird errors; apologies if you are receiving a
duplicate of this message.)

I have created a pull request with a new univariate finite difference
framework for [math]. The main features of this framework are:
1. *Exact* calculation of the coefficients for any stencil type
(forward, backward, central), derivative order, and accuracy order.
This is done by solving the system of linear equations generated by
repeated Taylor expansion, using a field LU decomposition over
BigFraction. This is an important feature because it ensures
there is no roundoff error in the coefficients, and we can tell
whether a coefficient is *exactly* zero.
2. Separation of derivative calculation and bandwidth selection -
specifically, there is a strategy interface that decides the
bandwidth based on the point at which the derivative is desired (and
the nature of the stencil, etc.).
3. Several canned bandwidth strategies:
 a. A fixed bandwidth.
 b. A "rule-of-thumb" strategy (which covers far more general cases
than the rule-of-thumb suggested in a certain numerical cookbook).
 c. An adaptive strategy that estimates the error scale of the finite
difference function in order to determine the optimal bandwidth. This
is done using a technique akin to Richardson extrapolation.
 d. A decorator strategy to round bandwidths to a power-of-two. (Using
powers-of-two bandwidths is good because they have exact IEEE floating
point representations, so x +/- h will be correct to full precision.)
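For the curious, the coefficient computation in point 1 can be sketched over
plain doubles (the pull request uses exact BigFraction arithmetic; the class
and method names below are made up for illustration). Each stencil point is
Taylor-expanded, giving the linear system sum_j c_j s_j^k / k! = [k == d]:

```java
import java.util.Arrays;

public class FdWeights {

    /** Coefficients c_j such that sum_j c_j f(x + s_j h) ~ h^d f^(d)(x). */
    public static double[] coefficients(double[] stencil, int derivOrder) {
        int n = stencil.length;
        double[][] a = new double[n][n + 1];  // augmented matrix [A | b]
        double kFact = 1.0;
        for (int k = 0; k < n; k++) {
            if (k > 0) kFact *= k;
            for (int j = 0; j < n; j++) {
                a[k][j] = Math.pow(stencil[j], k) / kFact;
            }
            a[k][n] = (k == derivOrder) ? 1.0 : 0.0;
        }
        // Gauss-Jordan elimination with partial pivoting.
        for (int col = 0; col < n; col++) {
            int p = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[p][col])) p = r;
            }
            double[] tmp = a[col]; a[col] = a[p]; a[p] = tmp;
            for (int r = 0; r < n; r++) {
                if (r == col) continue;
                double factor = a[r][col] / a[col][col];
                for (int c = col; c <= n; c++) a[r][c] -= factor * a[col][c];
            }
        }
        double[] coeff = new double[n];
        for (int i = 0; i < n; i++) coeff[i] = a[i][n] / a[i][i];
        return coeff;
    }

    public static void main(String[] args) {
        // Three-point central stencil, first derivative.
        System.out.println(Arrays.toString(
                coefficients(new double[]{-1, 0, 1}, 1)));
        // prints [-0.5, 0.0, 0.5]
    }
}
```

With doubles the solution already shows roundoff for larger stencils, which is
exactly why the real implementation works over BigFraction.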

Strategies b. and c. can also accept a function condition error
parameter. This is quite important if you know that the underlying
function is not computed to full machine precision - e.g. the function
is an approximation accurate to 1e-10, or is computed in IEEE single
precision rather than double precision.

Bandwidth strategy c. is derived from a thesis published by Ravi
Mathur. Mathur provides formulae that show how to find the optimal
bandwidth - optimal in the sense that it minimizes the total error
(which is a delicate balancing act!). His thesis can be found here:
https://repositories.lib.utexas.edu/bitstream/handle/2152/ETD-UT-2012-05-5275/MATHUR-DISSERTATION.pdf?sequence=1

(This is the first implementation of Mathur's ideas that I have seen.
Most of his work is thorough, so although this implementation is "new"
and thus not considered an industry standard, I think it is
definitely worth including. Some initial numerical experiments confirm
as much.)
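To spell out why the balancing act is delicate: a scheme of accuracy order p
has truncation error that scales like h^p, while roundoff contributes roughly
condErr / h, so the total error is minimized near h ~ condErr^(1/(p+1)). A
rough sketch of a rule-of-thumb along those lines (constants omitted; this is
not Mathur's exact formula, and the class name is invented):

```java
public class BandwidthRuleOfThumb {
    /**
     * Rule-of-thumb bandwidth for a scheme whose truncation error is
     * O(h^accuracyOrder), differentiating a function computed with relative
     * error condErr (machine epsilon for a full-precision function).
     * Balancing truncation ~ h^p against roundoff ~ condErr / h gives
     * h* ~ condErr^(1/(p+1)).
     */
    public static double bandwidth(double x, int accuracyOrder, double condErr) {
        double h = Math.pow(condErr, 1.0 / (accuracyOrder + 1));
        return h * (Math.abs(x) + 1.0);  // scale with the evaluation point
    }

    public static void main(String[] args) {
        double eps = Math.ulp(1.0);  // ~2.2e-16
        // Second-order central difference at x = 0: h* ~ eps^(1/3) ~ 6e-6.
        System.out.println(bandwidth(0.0, 2, eps));
    }
}
```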

These strategies should cover the vast majority of use cases, and of
course the strategy interface allows users to implement anything
bespoke if the need arises.

I am working on generating more unit tests now. For now, there are
only relatively simple tests for the coefficient generation and
"integration tests" that compare analytical vs. numerical derivatives.
If this is something that the community wants to include in math4, I
will be happy to fully flesh them out.

I am also working on a multivariate version. Generating the
coefficients is easy... the tricky part for the multivariate case is
the bandwidth strategies. The fixed bandwidth strategy is obviously
trivial. I am trying to extend Mathur's work to the multivariate case
to derive analogous rule-of-thumb and adaptive strategies, but the
equations are a bit ticklish in the multivariate world.

In any case, the JIRA ticket and pull request are here:
https://issues.apache.org/jira/browse/MATH-1325
https://github.com/apache/commons-math/pull/24

Any suggestions are welcome,
Fran.





Re: [math] Question regarding use of CM - who is using it?

2013-02-01 Thread Fran Lattanzio
Hi,

We use CM in an options analytics package. Some of the interesting usages are:
1. Implied vol calculations (a semi-fancy root find over a function with
no closed form).
2. Dividend projections.
3. Option greeks (partial derivatives of value function), both
automatic and numerical.
4. Non-linear model fitting (vol smile and surface models, for example).
5. Yield curve stripping and interpolation/extrapolation.
6. Non-parametric smoothing.

There are many more. CM is central to what the package does.

F.

On 1/31/13, Thomas Neidhart thomas.neidh...@gmail.com wrote:
 Hi,

 Gilles and myself will give a presentation on CM at FOSDEM this weekend
 and we would like to have a few more examples of how CM is currently used.

 I know Phil asked a similar question a few weeks ago, was there any
 feedback?

 Thanks,

 Thomas




Re: [math] Numerical derivatives in Commons Math

2011-08-12 Thread Fran Lattanzio
I have a working prototype for real univariate functions that uses
finite differences. The code is very small and simple, but quite
messy. I'll clean it up soon. It's trivial to extend this to
univariate vectorial and matrix functions. I think it's also worth adding
Savitzky-Golay smoothing filters for univariate functions in the
first pass. Computing SG coefficients is really easy:
http://en.wikipedia.org/wiki/Savitzky%E2%80%93Golay_smoothing_filter
Or see Numerical Recipes. (The implementation in NR is crazy, but
obviously correct. It could be written with CM objects in about 10
simple, clear lines; NR's takes about 50 lines of relative madness.)
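To back up that claim, here is a self-contained sketch of the least-squares
derivation with plain arrays (CM's linear algebra classes would shorten it
further; the class name is invented). Fit a degree-d polynomial to the window
and evaluate it at the center; the weight on point j is the first row of
(A^T A)^{-1} A^T, where A[j][k] = t_j^k:

```java
import java.util.Arrays;

public class SavitzkyGolay {

    /** SG smoothing weights for the given window offsets and poly degree. */
    public static double[] weights(int[] offsets, int degree) {
        int m = offsets.length, d = degree + 1;
        double[][] ata = new double[d][2 * d];   // [A^T A | I], to invert
        for (int r = 0; r < d; r++) {
            for (int c = 0; c < d; c++) {
                for (int t : offsets) ata[r][c] += Math.pow(t, r + c);
            }
            ata[r][d + r] = 1.0;
        }
        // Gauss-Jordan inversion (fine for the tiny systems involved here).
        for (int col = 0; col < d; col++) {
            int p = col;
            for (int r = col + 1; r < d; r++)
                if (Math.abs(ata[r][col]) > Math.abs(ata[p][col])) p = r;
            double[] tmp = ata[col]; ata[col] = ata[p]; ata[p] = tmp;
            double piv = ata[col][col];
            for (int c = 0; c < 2 * d; c++) ata[col][c] /= piv;
            for (int r = 0; r < d; r++) {
                if (r == col) continue;
                double f = ata[r][col];
                for (int c = 0; c < 2 * d; c++) ata[r][c] -= f * ata[col][c];
            }
        }
        // weight_j = sum_k inv[0][k] * t_j^k
        double[] w = new double[m];
        for (int j = 0; j < m; j++)
            for (int k = 0; k < d; k++)
                w[j] += ata[0][d + k] * Math.pow(offsets[j], k);
        return w;
    }

    public static void main(String[] args) {
        // Classic 5-point quadratic window: expect [-3, 12, 17, 12, -3] / 35.
        System.out.println(Arrays.toString(weights(new int[]{-2, -1, 0, 1, 2}, 2)));
    }
}
```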

Adding SG filters in the first pass is, I think, a reasonably good
idea, since both SG and FD are basically convolution-type filters, so
we could develop some shared infrastructure for them. With it, we could
also use other algorithms, such as the low-noise filters from Holoborodko
mentioned in a previous email, without much extra work.

I think the multivariate world should come later. It can be built atop
the univariate infrastructure for the most part, so
all-things-univariate should get nailed down before tackling the
multivariate side.

Shall I create a JIRA ticket and submit a patch for this?

Cheers,
Fran.



On Fri, Aug 12, 2011 at 7:42 AM, Patrick Meyer meyer...@gmail.com wrote:
 Thanks for the information Luc. I didn't know those existed. I'm happy to
 keep the discussion at the implementation levels.

 On 8/12/2011 6:23 AM, Luc Maisonobe wrote:

 Le 12/08/2011 00:30, Patrick Meyer a écrit :

 I like the idea of adding this feature. What about an abstract class
 that implements DifferentiableMultivariateRealFunction and provides the
 method for partialDerivative (). People could then override the
 partialDerivative method if they have an analytic derivative.

 Here's some code that I'm happy to contribute to Commons-math. It
 computes the derivative by the central difference method and the
 Hessian by finite differences. I can add this to JIRA when it's there.

 Hi Patrick,

 I think we need to discuss the API we want first, and then the
 implementation. There are already other finite-difference implementations
 in some tests for Apache Commons Math, and a complete package in Apache
 Commons Nabla. Your code adds the Hessian to this, which is really a good thing.

 Thanks,
 Luc


 /**
  * Numerically compute the gradient by the central difference method.
  * Override this method when the analytic gradient is available.
  *
  * @param x the point at which to evaluate the gradient
  * @return the gradient vector
  */
 public double[] derivativeAt(double[] x) {
     int n = x.length;
     double[] grd = new double[n];
     double[] u = Arrays.copyOfRange(x, 0, x.length);
     for (int i = 0; i < n; i++) {
         // step size from the SAS manual on the NLP procedure
         double stepSize = Math.sqrt(EPSILON) * (Math.abs(x[i]) + 1.0);
         u[i] = x[i] + stepSize;
         double f1 = valueAt(u);
         u[i] = x[i] - stepSize;
         double f2 = valueAt(u);
         grd[i] = (f1 - f2) / (2.0 * stepSize);
         u[i] = x[i]; // restore before perturbing the next coordinate
     }
     return grd;
 }

 /**
  * Numerically compute the Hessian using a forward difference of gradients.
  * Override this method when the analytic Hessian is available.
  *
  * @param x the point at which to evaluate the Hessian
  * @return the Hessian matrix
  */
 public double[][] hessianAt(double[] x) {
     int n = x.length;
     double[][] hessian = new double[n][n];
     double[] gradientAtX = derivativeAt(x);
     for (int j = 0; j < n; j++) {
         // step size from the SAS manual on the NLP procedure
         double stepSize = Math.sqrt(EPSILON) * (Math.abs(x[j]) + 1.0);
         double xtemp = x[j];
         x[j] = xtemp + stepSize;
         double[] xCopy = Arrays.copyOfRange(x, 0, x.length);
         double[] gradientAtXpls = derivativeAt(xCopy);
         x[j] = xtemp;
         for (int i = 0; i < n; i++) {
             hessian[i][j] = (gradientAtXpls[i] - gradientAtX[i]) / stepSize;
         }
     }
     return hessian;
 }


 On 8/11/2011 5:36 PM, Luc Maisonobe wrote:

 Le 11/08/2011 23:27, Fran Lattanzio a écrit :

 Hello,

 Hi Fran,


 I have a proposal for a numerical derivatives framework for Commons
 Math. I'd like to add the ability to take any UnivariateRealFunction
 and produce another function that represents its derivative for an
 arbitrary order. Basically, I'm saying add a factory-like interface
 that looks something like the following:

 public interface UnivariateNumericalDeriver {
 public UnivariateRealFunction derive(UnivariateRealFunction f, int
 derivOrder);
 }

 This sounds interesting. Did you have a look at the Commons Nabla
 UnivariateDifferentiator interface and its implementations?

 Luc


 For an initial implementation of this interface, I propose using
 finite differences - either central, forward, or backward. Computing
 the finite difference coefficients, for any derivative order and any
 error order, is a relatively trivial linear algebra problem. The user
 will simply choose an error order and difference type when setting up
 an FD univariate deriver - everything else will happen automagically.
 You can compute the FD coefficients once the user invokes the function
 in the interface above (might be expensive

[math] Numerical derivatives in Commons Math

2011-08-11 Thread Fran Lattanzio
Hello,

I have a proposal for a numerical derivatives framework for Commons
Math. I'd like to add the ability to take any UnivariateRealFunction
and produce another function that represents its derivative for an
arbitrary order. Basically, I'm saying add a factory-like interface
that looks something like the following:

public interface UnivariateNumericalDeriver {
 public UnivariateRealFunction derive(UnivariateRealFunction f, int derivOrder);
}

For an initial implementation of this interface, I propose using
finite differences - either central, forward, or backward. Computing
the finite difference coefficients, for any derivative order and any
error order, is a relatively trivial linear algebra problem. The user
will simply choose an error order and difference type when setting up
an FD univariate deriver - everything else will happen automagically.
You can compute the FD coefficients once the user invokes the function
in the interface above (might be expensive), and determine an
appropriate stencil width when they call evaluate(double) on the
function returned by the aforementioned method - for example, if the
user has asked for the nth derivative, we simply use the nth root of
the machine epsilon/double ulp for the stencil width. It would also be
pretty easy to let the user control this (which might be desirable in
some cases). Wikipedia has a decent article on FDs of all flavors:
http://en.wikipedia.org/wiki/Finite_difference
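A minimal sketch of what an FD-backed implementation might look like for the
first-order case (using java.util.function.DoubleUnaryOperator rather than
UnivariateRealFunction so the snippet is self-contained; the class name is
invented):

```java
import java.util.function.DoubleUnaryOperator;

public class CentralDifferenceDeriver {

    /**
     * First derivative of f via the 3-point central difference, with the
     * stencil width h ~ cube root of the machine epsilon (which balances
     * truncation error against roundoff for a second-order scheme).
     */
    public static DoubleUnaryOperator derive(DoubleUnaryOperator f) {
        return x -> {
            double h = Math.cbrt(Math.ulp(1.0)) * (Math.abs(x) + 1.0);
            return (f.applyAsDouble(x + h) - f.applyAsDouble(x - h)) / (2.0 * h);
        };
    }

    public static void main(String[] args) {
        DoubleUnaryOperator dSin = derive(Math::sin);
        System.out.println(dSin.applyAsDouble(0.0));  // ~ cos(0) = 1
    }
}
```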

There are, of course, many other univariate numerical derivative
schemes that could be added in the future - using Fourier transforms,
Barak's adaptive degree polynomial method, etc. These could be added
later. We could also add the ability to numerically differentiate at a
single point using an arbitrary or user-defined grid (rather than an
automatically generated one, like above). Barak's method and Fornberg's
finite difference coefficients could be used in this case:
http://pubs.acs.org/doi/abs/10.1021/ac00113a006
http://amath.colorado.edu/faculty/fornberg/Docs/MathComp_88_FD_formulas.pdf

It would also make sense to add vectorial and matrix-flavored versions
of interface above. These interfaces would be slightly more complex,
but nothing too crazy. Again, the initial implementation would be
finite differences. This would also be really easy to implement, since
multivariate FD coefficients are nothing more than an outer product of
their univariate cousins. The Wikipedia article also has some good
introductory material on multivariate FDs.
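A tiny illustration of the outer-product remark (hypothetical weights for the
mixed partial d^2f/dxdy built from two 3-point first-derivative stencils):

```java
import java.util.Arrays;

public class OuterProductStencil {

    /**
     * 2-D weights for the mixed partial d^2f/dxdy, formed as the outer
     * product of two 1-D central-difference stencils, so that
     * f_xy(x,y) ~ (1/(hx*hy)) * sum_{i,j} w[i][j] f(x + (i-1)hx, y + (j-1)hy).
     */
    public static double[][] mixedPartialWeights() {
        double[] c = {-0.5, 0.0, 0.5};  // 1-D weights on stencil {-h, 0, +h}
        double[][] w = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                w[i][j] = c[i] * c[j];
        return w;
    }

    public static void main(String[] args) {
        // Corners carry weight +/-0.25; the middle row and column are zero.
        System.out.println(Arrays.deepToString(mixedPartialWeights()));
    }
}
```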

Cheers,
Fran.
