Re: [math] Least Squares Outlier Rejection

2014-09-17 Thread Gilles

Hello.

On Tue, 16 Sep 2014 18:34:52 -0400, Evan and Maureen Ward wrote:

Hi Gilles, Luc,

Thanks for all the comments. I'll try to respond to the more fundamental
concerns in this email, and the more practical ones in another email, if we
decide that we want to include data editing in [math].




[...]




I don't see that the editing has to occur during the optimization.

It could be independent:
 1. Let the optimization run to completion with all the data
 2. Compute the residuals
 3a. If there are no outliers, stop
 3b. If there are outliers, remove them (or modify their weight)
 4. Run the optimization with the remaining data
 5. Goto 2

The advantage I see here is that the weight modification is a user
decision (and implementation).
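For illustration, the loop above (steps 1-5) could be sketched like this. It is a toy example: the "model" is just a weighted mean (a one-parameter least-squares fit), and the method names and the k-sigma cutoff are my own assumptions, not [math] API:

```java
// Sketch of the reject-and-refit loop (steps 1-5). The "model" is a weighted
// mean; fit(), fitWithRejection() and the k-sigma rule are illustrative.
import java.util.Arrays;

public class RejectAndRefit {

    // Weighted least-squares estimate of a constant model.
    static double fit(double[] y, double[] w) {
        double sw = 0, swy = 0;
        for (int i = 0; i < y.length; i++) { sw += w[i]; swy += w[i] * y[i]; }
        return swy / sw;
    }

    static double fitWithRejection(double[] y, double kSigma) {
        double[] w = new double[y.length];
        Arrays.fill(w, 1.0);                              // a-priori weights
        while (true) {
            double est = fit(y, w);                       // 1. run to completion
            double ss = 0;
            int n = 0;
            for (int i = 0; i < y.length; i++) {          // 2. residual statistics
                if (w[i] > 0) { double r = y[i] - est; ss += r * r; n++; }
            }
            double sigma = Math.sqrt(ss / n);
            boolean changed = false;
            for (int i = 0; i < y.length; i++) {          // 3b. zero-weight outliers
                if (w[i] > 0 && Math.abs(y[i] - est) > kSigma * sigma) {
                    w[i] = 0;
                    changed = true;
                }
            }
            if (!changed) {                               // 3a. no outliers: stop
                return est;
            }
        }                                                 // 4-5. refit and repeat
    }

    public static void main(String[] args) {
        // Five consistent observations and one gross outlier.
        double[] data = {1.0, 1.1, 0.9, 1.05, 0.95, 10.0};
        System.out.println(fitWithRejection(data, 2.0));
    }
}
```

The weight modification (here, zeroing) happens entirely outside the inner optimization, between complete runs, which is what makes it a user decision.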

However, IIUC the examples from the link provided by Evan, robust
optimization (i.e. handling outliers during optimization) could lead
to a better solution.

Would it always be better? Not sure: significant points could be
mistaken as outliers and be discarded before the optimizer could
figure out the correct solution...
To avoid a good but wrong fit, I'm afraid that we'd have to
introduce several parameters (like do not discard outliers before
iteration n, do not discard more than m points, etc.)
The tuning of those parameters will probably not be obvious, and
they will surely complicate the code (e.g. input data like the
target and weight array won't be final).



The advantage I see is not correctness (since the algorithm you outline
will converge correctly), but reducing function evaluations. (I don't have
data to back up this assertion.) Without inline data editing the
optimization algorithm would waste the evaluations between when the
outliers became obvious, and when the optimization converges. With the
inline scheme, the outliers are deleted as soon as possible, and the
remaining evaluations are used to converge towards the correct solution.


Converging to a good but wrong fit will always be a risk of any algorithm
that automatically throws away data. As with our other algorithms, I'm
expecting the user to know when the algorithm is a good fit for their
problem. The use case I see is when the observations contain mostly real
data, and a few random numbers. The bad observations can be hard to
identify a priori, but become obvious during the fitting process.



[...]


What works in one domain might not in another.
Thus, the feature should not alter the standardness, nor decrease
the robustness of the implementation. Catering for special cases
(for which the feature is useful) may be achieved by e.g. using the
standard algorithm as a basic block that is called repeatedly, as
hinted above (and tuning the standard parameters and input data
appropriately for each call).



I was not expecting the response that [math] may not want this feature.
I'm o.k. with this result since I can implement it as an addition to
[math], though the API won't be as clean.



IMHO, we cannot assume that fiddling with some of the data points while
the optimization progresses won't alter the correctness of the solution.


I think that when points are deemed outliers, e.g. using external
knowledge not available to the optimizer, they should be removed, and
the optimization redone on the real data.
As I understand it, robust optimization is not really a good name (I'd
suggest fuzzy, or something) because it will indeed assign less weight
to data points solely on the basis that they are less represented among
the currently available data, irrespective of whether they actually
pertain to the phenomenon being measured.
At first sight, this could allow the optimizer to drift farther and
farther away from the correct solution (if those points were not
outliers).






At first sight, I'd avoid modification of the sizes of input data
(option 2); from an API usage viewpoint, I imagine that user code will
require additional length tests.

Couldn't the problem you mention in option 1 disappear by having
different methods that return the a-priori weights and the modified
weights?



As we are already much too stringent with our compatibility policy, I
would allow the case where only *very* advanced users would have
problems. So it would seem fair to me if we can make some changes where
the users of the factory are unaffected and expert users who decided the
factory was not good for them have to put in new effort.



Before embarking on this, I would like to see examples where the inline
outlier rejection leads to a (correct) solution impossible to achieve
with the approach I've suggested.



As discussed above, I don't see any cases where inline outlier rejection
will result in a less correct solution than the algorithm you outline.


I do see a potential case, in my current work. ;-)


I do think we can save a significant number of function evaluations by
using inline outlier rejection.


We can of course talk about a performance-correctness trade-off, perhaps useful 

Re: [math] Least Squares Outlier Rejection

2014-09-16 Thread Evan and Maureen Ward
Hi Gilles, Luc,

Thanks for all the comments. I'll try to respond to the more fundamental
concerns in this email, and the more practical ones in another email, if we
decide that we want to include data editing in [math].

On Fri, Sep 12, 2014 at 8:55 AM, Gilles gil...@harfang.homelinux.org
wrote:

 Hi.


 On Fri, 12 Sep 2014 09:16:07 +0200, Luc Maisonobe wrote:

 Le 12/09/2014 01:35, Gilles a écrit :

 Hello.

 On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:

 Hi,

 A while ago I had brought up the idea of adding residual editing (aka data
 editing, outlier rejection, robust regression) to our non-linear least
 squares implementations.[1] As the name suggests, the idea is to de-weight
 observations that don't match the user's model. There are several ways to
 do this, including choosing a fixed cutoff, a fixed standard deviation
 cutoff, or reducing a residual's weight based on its magnitude.[2]

 However we add the data editing feature, I think it will cause backward
 incompatibilities with the released API. I've outlined below the two
 options I see. I'm open to other ideas as well.

 1. Replace edited residuals with 0's in the residual vector and Jacobian
 (i.e. apply a 0 weight). This has the advantage of being simple to
 implement and that our existing optimizers are already able to handle
 it.
 The downside is evident when the user tries to obtain the number of
 residuals that were edited. It is hard to tell the difference between an
 edited residual, an a priori zero weight, or a model evaluation where the
 residual and gradient is, in fact, zero. We can provide easy access to
 the
 number of edited residuals by adding a method to the Evaluation
 interface.
 (This is what I implemented in the patch in the original thread.) Now
 that
 the code has been released though, this would cause a backward
 incompatibility for some advanced users. Most users will likely use the
 included factory and builder methods to define their
 LeastSquaresProblem(LSP) and these users would not be affected by the
 change. Only the users that provide a custom implementation of
 LSP.Evaluation would be affected.

 2. Remove edited residuals from the gradient and Jacobian, so that the
 resulting vector and matrix have fewer rows. The advantage here is
 that the
 user can compare the length of the residual vector in the Optimum to the
 number of observations in the LSP to determine the number of edited
 residuals. The problem is that returning vectors/matrices with different
 sizes from LSP.evaluate() would violate the contract. Additionally we
 would
 have to modify our existing optimizers to deal with the variable
 lengths.
 For GaussNewton the modification would be small, but for
 LevenbergMarquardt
 I would likely have to re-write it since I don't understand the code
 (not
 for lack of trying :P ). Users that implement LeastSquaresOptimizers
 would
 likely have to modify their code as well.

 To summarize, in both cases users that only use the provided [math]
 classes
 would not have to modify their code, while users that provide custom
 implementations of [math] interfaces would have to modify their code.

 I would like to get this feature wrapped in for the next release. Please
 let me know if you have a preference for either implementation and if
 there
 are any other issues I should consider.


 Compatibility breaks cannot occur in minor releases.
 The next major release should not occur before deprecated classes are all
 replaced. [I'm thinking about the optimizers, for which the fluent API
 should
 be implemented based on your design of NLLS.]


 I theoretically agree here, but we are in a sort of chicken and egg
 problem. Typically, when I proposed to switch the ODE to the new fluent
 API, we decided we should do it at a major release only. We also said we
 cannot remove deprecation at minor releases. Now we say major release
 should wait for deprecation to be removed.

 So, as far as I understand, everything should occur at once for major
 releases : remove deprecated classes, introduce new classes and
 interfaces, but see next remark.


 No, there is actually no chicken and egg problem in this case.
 The issue is only that we should not delete an API before a replacement
 is provided.
 So
 1. We want to remove the deprecated optimizers API
- must be done in major release (e.g. 4.0)
 2. We want to let users upgrade their code
- the new API must exist before 4.0

 1 and 2 are not contradictory as the new API will be written from
 scratch (it is not a modification, say, of existing interfaces).


 It would be nice to recode the whole LevenbergMarquardtOptimizer in
 full OO Java. But it should be implemented and tested before any new
 feature is added to the mix.


 Here comes another problem: we want to have time to stabilize new API,
 which means we should have a few minor releases with trial API.
 Unfortunately, once these trial API have been introduced, we are stuck
 with them and we end up piling 

Re: [math] Least Squares Outlier Rejection

2014-09-12 Thread Luc Maisonobe
Le 12/09/2014 01:35, Gilles a écrit :
 Hello.
 
 On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:
 Hi,

 A while ago I had brought up the idea of adding residual editing (aka data
 editing, outlier rejection, robust regression) to our non-linear least
 squares implementations.[1] As the name suggests, the idea is to
 de-weight
 observations that don't match the user's model. There are several ways to
 do this, including choosing a fixed cutoff, a fixed standard deviation
 cutoff,
 or reducing a residual's weight based on its magnitude.[2]

 However we add the data editing feature, I think it will cause backward
 incompatibilities with the released API. I've outlined below the two
 options I see. I'm open to other ideas as well.

 1. Replace edited residuals with 0's in the residual vector and Jacobian
 (i.e. apply a 0 weight). This has the advantage of being simple to
 implement and that our existing optimizers are already able to handle it.
 The downside is evident when the user tries to obtain the number of
 residuals that were edited. It is hard to tell the difference between an
 edited residual, an a priori zero weight, or a model evaluation where the
 residual and gradient is, in fact, zero. We can provide easy access to
 the
 number of edited residuals by adding a method to the Evaluation
 interface.
 (This is what I implemented in the patch in the original thread.) Now
 that
 the code has been released though, this would cause a backward
 incompatibility for some advanced users. Most users will likely use the
 included factory and builder methods to define their
 LeastSquaresProblem(LSP) and these users would not be affected by the
 change. Only the users that provide a custom implementation of
 LSP.Evaluation would be affected.

 2. Remove edited residuals from the gradient and Jacobian, so that the
 resulting vector and matrix have fewer rows. The advantage here is
 that the
 user can compare the length of the residual vector in the Optimum to the
 number of observations in the LSP to determine the number of edited
 residuals. The problem is that returning vectors/matrices with different
 sizes from LSP.evaluate() would violate the contract. Additionally we
 would
 have to modify our existing optimizers to deal with the variable lengths.
 For GaussNewton the modification would be small, but for
 LevenbergMarquardt
 I would likely have to re-write it since I don't understand the code (not
 for lack of trying :P ). Users that implement LeastSquaresOptimizers
 would
 likely have to modify their code as well.

 To summarize, in both cases users that only use the provided [math]
 classes
 would not have to modify their code, while users that provide custom
 implementations of [math] interfaces would have to modify their code.

 I would like to get this feature wrapped in for the next release. Please
 let me know if you have a preference for either implementation and if
 there
 are any other issues I should consider.
 
 Compatibility breaks cannot occur in minor releases.
 The next major release should not occur before deprecated classes are all
 replaced. [I'm thinking about the optimizers, for which the fluent API
 should
 be implemented based on your design of NLLS.]

I theoretically agree here, but we are in a sort of chicken and egg
problem. Typically, when I proposed to switch the ODE to the new fluent
API, we decided we should do it at a major release only. We also said we
cannot remove deprecation at minor releases. Now we say major release
should wait for deprecation to be removed.

So, as far as I understand, everything should occur at once for major
releases : remove deprecated classes, introduce new classes and
interfaces, but see next remark.

 
 It would be nice to recode the whole LevenbergMarquardtOptimizer in
 full OO Java. But it should be implemented and tested before any new
 feature is added to the mix.

Here comes another problem: we want to have time to stabilize new API,
which means we should have a few minor releases with trial API.
Unfortunately, once these trial API have been introduced, we are stuck
with them and we end up piling several APIs together (the optimizers are
the perfect example).

So when do we test new APIs: at minor releases just before a major
release, but then we keep all the trials until a big clean-up at major
release?

 
 Do I understand correctly that in the robust fit, the weights are
 modified
 during the optimization?
 If so, would the algorithms still be standard?

I think performing some modifications between iterations is still a
standard way to perform some optimizations: we can detect the outliers
only once we have computed residuals and some statistics like the
standard deviation, so this editing operation should occur somewhere
during the optimization process. I have already seen this typically in
orbit determination so I would consider it a classical optional feature.
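As a rough sketch of the between-iterations editing step Luc describes (the method name and the k-sigma rule are illustrative assumptions, not an existing [math] API):

```java
// Sketch of an in-optimization editing step: compute the residual standard
// deviation, then flag outliers for de-weighting before the next iteration.
// Note that a gross outlier inflates sigma, so the cutoff factor k matters.
public class InlineEdit {

    static boolean[] flagOutliers(double[] residuals, double k) {
        double ss = 0;
        for (double r : residuals) ss += r * r;
        double sigma = Math.sqrt(ss / residuals.length);   // residual statistics
        boolean[] edited = new boolean[residuals.length];
        for (int i = 0; i < residuals.length; i++) {
            edited[i] = Math.abs(residuals[i]) > k * sigma; // k-sigma rule
        }
        return edited;
    }

    public static void main(String[] args) {
        boolean[] e = flagOutliers(new double[]{0.1, -0.2, 0.15, 0.05, -0.1, 9.0}, 2.0);
        System.out.println(java.util.Arrays.toString(e));
        // prints [false, false, false, false, false, true]
    }
}
```

The optimizer would apply this between iterations, zeroing the weight of each flagged residual before the next evaluation.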

 
 At first sight, I'd avoid modification of the sizes of input data
 (option 

Re: [math] Least Squares Outlier Rejection

2014-09-12 Thread Gilles

Hi.

On Fri, 12 Sep 2014 09:16:07 +0200, Luc Maisonobe wrote:

Le 12/09/2014 01:35, Gilles a écrit :

Hello.

On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:

Hi,

A while ago I had brought up the idea of adding residual editing (aka data
editing, outlier rejection, robust regression) to our non-linear least
squares implementations.[1] As the name suggests, the idea is to de-weight
observations that don't match the user's model. There are several ways to
do this, including choosing a fixed cutoff, a fixed standard deviation
cutoff, or reducing a residual's weight based on its magnitude.[2]

However we add the data editing feature, I think it will cause backward
incompatibilities with the released API. I've outlined below the two
options I see. I'm open to other ideas as well.

1. Replace edited residuals with 0's in the residual vector and Jacobian
(i.e. apply a 0 weight). This has the advantage of being simple to
implement and that our existing optimizers are already able to handle it.
The downside is evident when the user tries to obtain the number of
residuals that were edited. It is hard to tell the difference between an
edited residual, an a priori zero weight, or a model evaluation where the
residual and gradient is, in fact, zero. We can provide easy access to the
number of edited residuals by adding a method to the Evaluation interface.
(This is what I implemented in the patch in the original thread.) Now that
the code has been released though, this would cause a backward
incompatibility for some advanced users. Most users will likely use the
included factory and builder methods to define their
LeastSquaresProblem (LSP) and these users would not be affected by the
change. Only the users that provide a custom implementation of
LSP.Evaluation would be affected.

2. Remove edited residuals from the gradient and Jacobian, so that the
resulting vector and matrix have fewer rows. The advantage here is that
the user can compare the length of the residual vector in the Optimum to
the number of observations in the LSP to determine the number of edited
residuals. The problem is that returning vectors/matrices with different
sizes from LSP.evaluate() would violate the contract. Additionally we
would have to modify our existing optimizers to deal with the variable
lengths. For GaussNewton the modification would be small, but for
LevenbergMarquardt I would likely have to re-write it since I don't
understand the code (not for lack of trying :P ). Users that implement
LeastSquaresOptimizers would likely have to modify their code as well.

To summarize, in both cases users that only use the provided [math]
classes would not have to modify their code, while users that provide
custom implementations of [math] interfaces would have to modify their
code.


I would like to get this feature wrapped in for the next release. Please
let me know if you have a preference for either implementation and if
there are any other issues I should consider.


Compatibility breaks cannot occur in minor releases.
The next major release should not occur before deprecated classes are
all replaced. [I'm thinking about the optimizers, for which the fluent
API should be implemented based on your design of NLLS.]


I theoretically agree here, but we are in a sort of chicken and egg
problem. Typically, when I proposed to switch the ODE to the new fluent
API, we decided we should do it at a major release only. We also said we
cannot remove deprecation at minor releases. Now we say major release
should wait for deprecation to be removed.

So, as far as I understand, everything should occur at once for major
releases: remove deprecated classes, introduce new classes and
interfaces, but see next remark.


No, there is actually no chicken and egg problem in this case.
The issue is only that we should not delete an API before a replacement
is provided.
So
1. We want to remove the deprecated optimizers API
   - must be done in major release (e.g. 4.0)
2. We want to let users upgrade their code
   - the new API must exist before 4.0

1 and 2 are not contradictory as the new API will be written from
scratch (it is not a modification, say, of existing interfaces).



It would be nice to recode the whole LevenbergMarquardtOptimizer in
full OO Java. But it should be implemented and tested before any new
feature is added to the mix.


Here comes another problem: we want to have time to stabilize new API,
which means we should have a few minor releases with trial API.
Unfortunately, once these trial API have been introduced, we are stuck
with them and we end up piling several APIs together (the optimizers are
the perfect example).


Piling up, yes, but it is not a problem: all the old APIs are already
deprecated; only the replacements are missing. We must provide them
in a minor release before 4.0. [This should not need to delay the next
release (3.4), as we can have as many 3.x as we want...]



So when 

[math] Least Squares Outlier Rejection

2014-09-11 Thread Evan Ward
Hi,

A while ago I had brought up the idea of adding residual editing (aka data
editing, outlier rejection, robust regression) to our non-linear least
squares implementations.[1] As the name suggests, the idea is to de-weight
observations that don't match the user's model. There are several ways to
do this, including choosing a fixed cutoff, a fixed standard deviation
cutoff, or reducing a residual's weight based on its magnitude.[2]
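The three schemes just mentioned can be sketched as per-residual weight functions. These names are hypothetical, only the ideas come from the references; the magnitude-based scheme here is a Huber-style decay:

```java
// Three editing schemes as per-residual weight functions (illustrative only):
// a fixed cutoff, a k-standard-deviation cutoff, and magnitude-based
// down-weighting.
public class EditingSchemes {

    // Weight 0 beyond a fixed residual cutoff, 1 otherwise.
    static double fixedCutoff(double r, double cutoff) {
        return Math.abs(r) <= cutoff ? 1.0 : 0.0;
    }

    // Weight 0 beyond k standard deviations of the residuals.
    static double sigmaCutoff(double r, double sigma, double k) {
        return Math.abs(r) <= k * sigma ? 1.0 : 0.0;
    }

    // Huber-style: full weight for small residuals, decaying as 1/|r| beyond delta.
    static double huber(double r, double delta) {
        return Math.abs(r) <= delta ? 1.0 : delta / Math.abs(r);
    }

    public static void main(String[] args) {
        System.out.println(fixedCutoff(0.5, 1.0));      // 1.0: inlier kept
        System.out.println(sigmaCutoff(7.5, 1.5, 3.0)); // 0.0: beyond 3 sigma, edited
        System.out.println(huber(4.0, 2.0));            // 0.5: down-weighted, not discarded
    }
}
```

The first two schemes delete observations outright; the third merely reduces their influence, which is closer to classical robust regression.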

However we add the data editing feature, I think it will cause backward
incompatibilities with the released API. I've outlined below the two
options I see. I'm open to other ideas as well.

1. Replace edited residuals with 0's in the residual vector and Jacobian
(i.e. apply a 0 weight). This has the advantage of being simple to
implement and that our existing optimizers are already able to handle it.
The downside is evident when the user tries to obtain the number of
residuals that were edited. It is hard to tell the difference between an
edited residual, an a priori zero weight, or a model evaluation where the
residual and gradient is, in fact, zero. We can provide easy access to the
number of edited residuals by adding a method to the Evaluation interface.
(This is what I implemented in the patch in the original thread.) Now that
the code has been released though, this would cause a backward
incompatibility for some advanced users. Most users will likely use the
included factory and builder methods to define their
LeastSquaresProblem(LSP) and these users would not be affected by the
change. Only the users that provide a custom implementation of
LSP.Evaluation would be affected.

2. Remove edited residuals from the gradient and Jacobian, so that the
resulting vector and matrix have fewer rows. The advantage here is that the
user can compare the length of the residual vector in the Optimum to the
number of observations in the LSP to determine the number of edited
residuals. The problem is that returning vectors/matrices with different
sizes from LSP.evaluate() would violate the contract. Additionally we would
have to modify our existing optimizers to deal with the variable lengths.
For GaussNewton the modification would be small, but for LevenbergMarquardt
I would likely have to re-write it since I don't understand the code (not
for lack of trying :P ). Users that implement LeastSquaresOptimizers would
likely have to modify their code as well.

To summarize, in both cases users that only use the provided [math] classes
would not have to modify their code, while users that provide custom
implementations of [math] interfaces would have to modify their code.
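For concreteness, the row mechanics behind the two options can be sketched on plain arrays (this is not the LSP.Evaluation API, just the operations involved):

```java
// The two options for an edited residual at index i, on plain arrays:
// option 1 keeps the sizes and zeroes the row; option 2 shrinks the vector.
import java.util.Arrays;

public class EditOptions {

    // Option 1: keep sizes, zero the edited residual and its Jacobian row
    // (equivalent to applying a 0 weight).
    static void zeroRow(double[] r, double[][] j, int i) {
        r[i] = 0.0;
        Arrays.fill(j[i], 0.0);
    }

    // Option 2: return a shorter residual vector with row i removed
    // (the Jacobian would lose the same row).
    static double[] dropRow(double[] r, int i) {
        double[] out = new double[r.length - 1];
        System.arraycopy(r, 0, out, 0, i);
        System.arraycopy(r, i + 1, out, i, r.length - i - 1);
        return out;
    }

    public static void main(String[] args) {
        double[] r = {1.0, 9.0, 2.0};
        double[][] j = {{1, 2}, {3, 4}, {5, 6}};
        zeroRow(r, j, 1);
        System.out.println(Arrays.toString(r));   // [1.0, 0.0, 2.0]
        System.out.println(Arrays.toString(dropRow(new double[]{1.0, 9.0, 2.0}, 1))); // [1.0, 2.0]
    }
}
```

Option 1's ambiguity is visible here: after zeroRow, a genuine zero residual and an edited one look identical, hence the proposal to count edited residuals via a new Evaluation method. Option 2 makes the count trivial (compare lengths) but changes the returned sizes.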

I would like to get this feature wrapped in for the next release. Please
let me know if you have a preference for either implementation and if there
are any other issues I should consider.

Best Regards,
Evan

[1] http://markmail.org/message/e53nago3swvu3t52
 https://issues.apache.org/jira/browse/MATH-1105
[2] http://www.mathworks.com/help/curvefit/removing-outliers.html
 http://www.mathworks.com/help/curvefit/least-squares-fitting.html


Re: [math] Least Squares Outlier Rejection

2014-09-11 Thread Gilles

Hello.

On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:

Hi,

A while ago I had brought up the idea of adding residual editing (aka data
editing, outlier rejection, robust regression) to our non-linear least
squares implementations.[1] As the name suggests, the idea is to de-weight
observations that don't match the user's model. There are several ways to
do this, including choosing a fixed cutoff, a fixed standard deviation
cutoff, or reducing a residual's weight based on its magnitude.[2]

However we add the data editing feature, I think it will cause backward
incompatibilities with the released API. I've outlined below the two
options I see. I'm open to other ideas as well.

1. Replace edited residuals with 0's in the residual vector and Jacobian
(i.e. apply a 0 weight). This has the advantage of being simple to
implement and that our existing optimizers are already able to handle it.
The downside is evident when the user tries to obtain the number of
residuals that were edited. It is hard to tell the difference between an
edited residual, an a priori zero weight, or a model evaluation where the
residual and gradient is, in fact, zero. We can provide easy access to the
number of edited residuals by adding a method to the Evaluation interface.
(This is what I implemented in the patch in the original thread.) Now that
the code has been released though, this would cause a backward
incompatibility for some advanced users. Most users will likely use the
included factory and builder methods to define their
LeastSquaresProblem (LSP) and these users would not be affected by the
change. Only the users that provide a custom implementation of
LSP.Evaluation would be affected.

2. Remove edited residuals from the gradient and Jacobian, so that the
resulting vector and matrix have fewer rows. The advantage here is that
the user can compare the length of the residual vector in the Optimum to
the number of observations in the LSP to determine the number of edited
residuals. The problem is that returning vectors/matrices with different
sizes from LSP.evaluate() would violate the contract. Additionally we
would have to modify our existing optimizers to deal with the variable
lengths. For GaussNewton the modification would be small, but for
LevenbergMarquardt I would likely have to re-write it since I don't
understand the code (not for lack of trying :P ). Users that implement
LeastSquaresOptimizers would likely have to modify their code as well.

To summarize, in both cases users that only use the provided [math]
classes would not have to modify their code, while users that provide
custom implementations of [math] interfaces would have to modify their
code.

I would like to get this feature wrapped in for the next release. Please
let me know if you have a preference for either implementation and if
there are any other issues I should consider.


Compatibility breaks cannot occur in minor releases.
The next major release should not occur before deprecated classes are
all replaced. [I'm thinking about the optimizers, for which the fluent
API should be implemented based on your design of NLLS.]

It would be nice to recode the whole LevenbergMarquardtOptimizer in full
OO Java. But it should be implemented and tested before any new feature
is added to the mix.

Do I understand correctly that in the robust fit, the weights are
modified during the optimization?
If so, would the algorithms still be standard?

At first sight, I'd avoid modification of the sizes of input data
(option 2); from an API usage viewpoint, I imagine that user code will
require additional length tests.

Couldn't the problem you mention in option 1 disappear by having
different methods that return the a-priori weights and the modified
weights?


Best regards,
Gilles



Best Regards,
Evan

[1] http://markmail.org/message/e53nago3swvu3t52
 https://issues.apache.org/jira/browse/MATH-1105
[2] http://www.mathworks.com/help/curvefit/removing-outliers.html
 http://www.mathworks.com/help/curvefit/least-squares-fitting.html



-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org