Re: [math] Least Squares Outlier Rejection
Hello.

On Tue, 16 Sep 2014 18:34:52 -0400, Evan and Maureen Ward wrote:

Hi Gilles, Luc,

Thanks for all the comments. I'll try to respond to the more fundamental concerns in this email, and the more practical ones in another email, if we decide that we want to include data editing in [math].

[...]

I don't see that the editing has to occur during the optimization. It could be independent:

1. Let the optimization run to completion with all the data
2. Compute the residuals
3a. If there are no outliers, stop
3b. If there are outliers, remove them (or modify their weight)
4. Run the optimization with the remaining data
5. Goto 2

The advantage I see here is that the weight modification is a user decision (and implementation).

However, IIUC the examples from the link provided by Evan, robust optimization (i.e. handling outliers during optimization) could lead to a better solution. Would it always be better? Not sure: significant points could be mistaken for outliers and be discarded before the optimizer could figure out the correct solution... To avoid a good but wrong fit, I'm afraid that we'd have to introduce several parameters (like "do not discard outliers before iteration n", "do not discard more than m points", etc.). The tuning of those parameters will probably not be obvious, and they will surely complicate the code (e.g. input data like the target and weight arrays won't be final).

The advantage I see is not correctness (since the algorithm you outline will converge correctly), but reducing function evaluations. (I don't have data to back up this assertion.) Without inline data editing, the optimization algorithm would waste the evaluations between when the outliers become obvious and when the optimization converges. With the inline scheme, the outliers are deleted as soon as possible, and the remaining evaluations are used to converge towards the correct solution.
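The outer-loop scheme above (fit, inspect residuals, drop outliers, refit) can be sketched in a few lines of Java. This is an illustration only, not [math] API: the "model" is just a constant, whose least-squares fit is the mean, standing in for a full LeastSquaresProblem, and the n-sigma cutoff is an assumed editing rule.

```java
import java.util.ArrayList;
import java.util.List;

public class IterativeEditing {

    // Step 1/4: "optimize" the toy model. The least-squares fit of a
    // constant to the data is simply the mean.
    static double fit(List<Double> data) {
        double sum = 0.0;
        for (double d : data) {
            sum += d;
        }
        return sum / data.size();
    }

    // Steps 2-5: compute residual statistics, drop points beyond
    // nSigma standard deviations, and refit until no outliers remain.
    static double edit(List<Double> data, double nSigma) {
        while (true) {
            double estimate = fit(data);                 // step 1/4: optimize
            double ss = 0.0;
            for (double d : data) {
                ss += (d - estimate) * (d - estimate);
            }
            double sigma = Math.sqrt(ss / data.size());  // step 2: residuals
            List<Double> kept = new ArrayList<>();
            for (double d : data) {
                if (Math.abs(d - estimate) <= nSigma * sigma) {
                    kept.add(d);
                }
            }
            if (kept.size() == data.size()) {
                return estimate;                         // step 3a: no outliers
            }
            data = kept;                                 // steps 3b/5: repeat
        }
    }
}
```

The key property Gilles points out holds here: the weight decision (here, hard removal at n sigma) is entirely outside the optimizer, which runs unmodified on each pass.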
Converging to a good but wrong fit will always be a risk of any algorithm that automatically throws away data. As with our other algorithms, I'm expecting the user to know when the algorithm is a good fit for their problem. The use case I see is when the observations contain mostly real data, and a few random numbers. The bad observations can be hard to identify a priori, but become obvious during the fitting process.

[...]

What works in one domain might not in another. Thus, the feature should not alter the standardness, nor decrease the robustness, of the implementation. Caring for special cases (for which the feature is useful) may be achieved by, e.g., using the standard algorithm as a basic block that is called repeatedly, as hinted above (and tuning the standard parameters and input data appropriately for each call).

I was not expecting the response that [math] may not want this feature. I'm o.k. with this result since I can implement it as an addition to [math], though the API won't be as clean.

IMHO, we cannot assume that fiddling with some of the data points while the optimization progresses won't alter the correctness of the solution. I think that when points are deemed outliers, e.g. using external knowledge not available to the optimizer, they should be removed, and the optimization redone on the real data. As I understand it, robust optimization is not really a good name (I'd suggest "fuzzy", or something) because it will indeed assign less weight to data points solely on the basis that they are less represented among the currently available data, irrespective of whether they actually pertain to the phenomenon being measured. At first sight, this could allow the optimizer to drift farther and farther away from the correct solution (if those points were not outliers).

At first sight, I'd avoid modification of the sizes of input data (option 2); from an API usage viewpoint, I imagine that user code will require additional length tests.
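The magnitude-based de-weighting under discussion can be illustrated with the well-known Tukey bisquare weight function from the robust-regression literature. The threshold k is exactly the kind of tuning parameter worried about above; this sketch is not part of [math]:

```java
public class RobustWeights {

    // Tukey bisquare weighting: weight 1.0 for a zero residual, smoothly
    // decreasing as |residual| grows, and exactly 0.0 once |residual| >= k
    // (the point is effectively discarded). The threshold k is the tuning
    // parameter; its choice is domain-dependent.
    static double tukeyWeight(double residual, double k) {
        double u = residual / k;
        if (Math.abs(u) >= 1.0) {
            return 0.0;            // beyond the cutoff: treated as an outlier
        }
        double t = 1.0 - u * u;
        return t * t;              // smooth de-weighting in between
    }
}
```

Note that the weight depends only on the residual magnitude relative to the current fit, which is precisely why a point far from the bulk of the data is de-weighted whether or not it truly pertains to the phenomenon being measured.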
Couldn't the problem you mention in option 1 disappear by having different methods that return the a-priori weights and the modified weights?

As we are already much too stringent with our compatibility policy, I would allow the case where only *very* advanced users would have problems. So it would seem fair to me if we can make some changes where the users of the factory are unaffected, and expert users who decided the factory was not good for them have to put in new effort.

Before embarking on this, I would like to see examples where the inline outlier rejection leads to a (correct) solution impossible to achieve with the approach I've suggested.

As discussed above, I don't see any cases where inline outlier rejection will result in a less correct solution than the algorithm you outline. I do see a potential case, in my current work. ;-) I do think we can save a significant number of function evaluations by using inline outlier rejection. We can of course talk about a performance-correctness trade-off, perhaps useful [...]
Re: [math] Least Squares Outlier Rejection
Hi Gilles, Luc,

Thanks for all the comments. I'll try to respond to the more fundamental concerns in this email, and the more practical ones in another email, if we decide that we want to include data editing in [math].

On Fri, Sep 12, 2014 at 8:55 AM, Gilles gil...@harfang.homelinux.org wrote:

Hi.

On Fri, 12 Sep 2014 09:16:07 +0200, Luc Maisonobe wrote:

On 12/09/2014 01:35, Gilles wrote:

Hello.

On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:

Hi,

A while ago I had brought up the idea of adding residual editing (aka data editing, outlier rejection, robust regression) to our non-linear least squares implementations.[1] As the name suggests, the idea is to de-weight observations that don't match the user's model. There are several ways to do this, including choosing a fixed cutoff, a fixed standard deviation cutoff, or reducing a residual's weight based on its magnitude.[2]

Whichever way we add the data editing feature, I think it will cause backward incompatibilities with the released API. I've outlined below the two options I see. I'm open to other ideas as well.

1. Replace edited residuals with 0's in the residual vector and Jacobian (i.e. apply a 0 weight). This has the advantage of being simple to implement, and our existing optimizers are already able to handle it. The downside is evident when the user tries to obtain the number of residuals that were edited: it is hard to tell the difference between an edited residual, an a priori zero weight, or a model evaluation where the residual and gradient are, in fact, zero. We can provide easy access to the number of edited residuals by adding a method to the Evaluation interface. (This is what I implemented in the patch in the original thread.) Now that the code has been released, though, this would cause a backward incompatibility for some advanced users. Most users will likely use the included factory and builder methods to define their LeastSquaresProblem (LSP), and these users would not be affected by the change. Only the users that provide a custom implementation of LSP.Evaluation would be affected.

2. Remove edited residuals from the residual vector and Jacobian, so that the resulting vector and matrix have fewer rows. The advantage here is that the user can compare the length of the residual vector in the Optimum to the number of observations in the LSP to determine the number of edited residuals. The problem is that returning vectors/matrices with different sizes from LSP.evaluate() would violate the contract. Additionally, we would have to modify our existing optimizers to deal with the variable lengths. For GaussNewton the modification would be small, but for LevenbergMarquardt I would likely have to re-write it, since I don't understand the code (not for lack of trying :P). Users that implement LeastSquaresOptimizer would likely have to modify their code as well.

To summarize: in both cases, users that only use the provided [math] classes would not have to modify their code, while users that provide custom implementations of [math] interfaces would have to modify their code.

I would like to get this feature wrapped in for the next release. Please let me know if you have a preference for either implementation and if there are any other issues I should consider.

Compatibility breaks cannot occur in minor releases. The next major release should not occur before deprecated classes are all replaced. [I'm thinking about the optimizers, for which the fluent API should be implemented based on your design of NLLS.]

I theoretically agree here, but we are in a sort of chicken-and-egg problem. Typically, when I proposed to switch the ODE code to the new fluent API, we decided we should do it at a major release only. We also said we cannot remove deprecated code at minor releases. Now we say a major release should wait for deprecated code to be removed. So, as far as I understand, everything should occur at once at major releases: remove deprecated classes, introduce new classes and interfaces. But see the next remark.

No, there is actually no chicken-and-egg problem in this case. The issue is only that we should not delete an API before a replacement is provided. So:

1. We want to remove the deprecated optimizers API - must be done in a major release (e.g. 4.0)
2. We want to let users upgrade their code - the new API must exist before 4.0

1 and 2 are not contradictory, as the new API will be written from scratch (it is not a modification, say, of existing interfaces).

It would be nice to recode the whole LevenbergMarquardtOptimizer in full OO Java. But it should be implemented and tested before any new feature is added to the mix.

Here comes another problem: we want to have time to stabilize a new API, which means we should have a few minor releases with trial APIs. Unfortunately, once these trial APIs have been introduced, we are stuck with them and we end up piling several APIs together (the optimizers are the perfect example).
Re: [math] Least Squares Outlier Rejection
On 12/09/2014 01:35, Gilles wrote:

Hello.

On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:

Hi,

A while ago I had brought up the idea of adding residual editing (aka data editing, outlier rejection, robust regression) to our non-linear least squares implementations.[1] As the name suggests, the idea is to de-weight observations that don't match the user's model. There are several ways to do this, including choosing a fixed cutoff, a fixed standard deviation cutoff, or reducing a residual's weight based on its magnitude.[2]

Whichever way we add the data editing feature, I think it will cause backward incompatibilities with the released API. I've outlined below the two options I see. I'm open to other ideas as well.

1. Replace edited residuals with 0's in the residual vector and Jacobian (i.e. apply a 0 weight). This has the advantage of being simple to implement, and our existing optimizers are already able to handle it. The downside is evident when the user tries to obtain the number of residuals that were edited: it is hard to tell the difference between an edited residual, an a priori zero weight, or a model evaluation where the residual and gradient are, in fact, zero. We can provide easy access to the number of edited residuals by adding a method to the Evaluation interface. (This is what I implemented in the patch in the original thread.) Now that the code has been released, though, this would cause a backward incompatibility for some advanced users. Most users will likely use the included factory and builder methods to define their LeastSquaresProblem (LSP), and these users would not be affected by the change. Only the users that provide a custom implementation of LSP.Evaluation would be affected.

2. Remove edited residuals from the residual vector and Jacobian, so that the resulting vector and matrix have fewer rows. The advantage here is that the user can compare the length of the residual vector in the Optimum to the number of observations in the LSP to determine the number of edited residuals. The problem is that returning vectors/matrices with different sizes from LSP.evaluate() would violate the contract. Additionally, we would have to modify our existing optimizers to deal with the variable lengths. For GaussNewton the modification would be small, but for LevenbergMarquardt I would likely have to re-write it, since I don't understand the code (not for lack of trying :P). Users that implement LeastSquaresOptimizer would likely have to modify their code as well.

To summarize: in both cases, users that only use the provided [math] classes would not have to modify their code, while users that provide custom implementations of [math] interfaces would have to modify their code.

I would like to get this feature wrapped in for the next release. Please let me know if you have a preference for either implementation and if there are any other issues I should consider.

Compatibility breaks cannot occur in minor releases. The next major release should not occur before deprecated classes are all replaced. [I'm thinking about the optimizers, for which the fluent API should be implemented based on your design of NLLS.]

I theoretically agree here, but we are in a sort of chicken-and-egg problem. Typically, when I proposed to switch the ODE code to the new fluent API, we decided we should do it at a major release only. We also said we cannot remove deprecated code at minor releases. Now we say a major release should wait for deprecated code to be removed. So, as far as I understand, everything should occur at once at major releases: remove deprecated classes, introduce new classes and interfaces. But see the next remark.

It would be nice to recode the whole LevenbergMarquardtOptimizer in full OO Java. But it should be implemented and tested before any new feature is added to the mix.

Here comes another problem: we want to have time to stabilize a new API, which means we should have a few minor releases with trial APIs. Unfortunately, once these trial APIs have been introduced, we are stuck with them and we end up piling several APIs together (the optimizers are the perfect example). So when do we test new APIs: at minor releases just before a major release, but then we keep all the trials until a big clean-up at the major release?

Do I understand correctly that in the robust fit, the weights are modified during the optimization? If so, would the algorithms still be standard?

I think performing some modifications between iterations is still a standard way to perform some optimizations: we can detect the outliers only once we have computed residuals and some statistics like the standard deviation, so this editing operation should occur somewhere during the optimization process. I have already seen this, typically in orbit determination, so I would consider it a classical optional feature.

At first sight, I'd avoid modification of the sizes of input data (option 2); from an API usage viewpoint, I imagine that user code will require additional length tests.
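The per-iteration editing Luc describes (detect outliers from the residuals and their standard deviation, then edit before the next step) could look like the following sketch. The fixed n-sigma rule and the class/method names are assumptions for illustration, not an existing [math] API:

```java
public class InlineSigmaEdit {

    // After the residuals of some iteration have been computed, flag the
    // points lying beyond nSigma standard deviations so that the next
    // iteration can de-weight or drop them. The RMS of the residuals is
    // used as the dispersion estimate.
    static boolean[] flagOutliers(double[] residuals, double nSigma) {
        double ss = 0.0;
        for (double r : residuals) {
            ss += r * r;
        }
        double sigma = Math.sqrt(ss / residuals.length);   // residual RMS
        boolean[] outlier = new boolean[residuals.length];
        for (int i = 0; i < residuals.length; i++) {
            outlier[i] = Math.abs(residuals[i]) > nSigma * sigma;
        }
        return outlier;
    }
}
```

This is the statistic that cannot be computed a priori: it only exists once the optimizer has produced residuals, which is Luc's argument for hooking the editing into the optimization loop itself.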
Re: [math] Least Squares Outlier Rejection
Hi.

On Fri, 12 Sep 2014 09:16:07 +0200, Luc Maisonobe wrote:

On 12/09/2014 01:35, Gilles wrote:

Hello.

On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:

Hi,

A while ago I had brought up the idea of adding residual editing (aka data editing, outlier rejection, robust regression) to our non-linear least squares implementations.[1] As the name suggests, the idea is to de-weight observations that don't match the user's model. There are several ways to do this, including choosing a fixed cutoff, a fixed standard deviation cutoff, or reducing a residual's weight based on its magnitude.[2]

Whichever way we add the data editing feature, I think it will cause backward incompatibilities with the released API. I've outlined below the two options I see. I'm open to other ideas as well.

1. Replace edited residuals with 0's in the residual vector and Jacobian (i.e. apply a 0 weight). This has the advantage of being simple to implement, and our existing optimizers are already able to handle it. The downside is evident when the user tries to obtain the number of residuals that were edited: it is hard to tell the difference between an edited residual, an a priori zero weight, or a model evaluation where the residual and gradient are, in fact, zero. We can provide easy access to the number of edited residuals by adding a method to the Evaluation interface. (This is what I implemented in the patch in the original thread.) Now that the code has been released, though, this would cause a backward incompatibility for some advanced users. Most users will likely use the included factory and builder methods to define their LeastSquaresProblem (LSP), and these users would not be affected by the change. Only the users that provide a custom implementation of LSP.Evaluation would be affected.

2. Remove edited residuals from the residual vector and Jacobian, so that the resulting vector and matrix have fewer rows. The advantage here is that the user can compare the length of the residual vector in the Optimum to the number of observations in the LSP to determine the number of edited residuals. The problem is that returning vectors/matrices with different sizes from LSP.evaluate() would violate the contract. Additionally, we would have to modify our existing optimizers to deal with the variable lengths. For GaussNewton the modification would be small, but for LevenbergMarquardt I would likely have to re-write it, since I don't understand the code (not for lack of trying :P). Users that implement LeastSquaresOptimizer would likely have to modify their code as well.

To summarize: in both cases, users that only use the provided [math] classes would not have to modify their code, while users that provide custom implementations of [math] interfaces would have to modify their code.

I would like to get this feature wrapped in for the next release. Please let me know if you have a preference for either implementation and if there are any other issues I should consider.

Compatibility breaks cannot occur in minor releases. The next major release should not occur before deprecated classes are all replaced. [I'm thinking about the optimizers, for which the fluent API should be implemented based on your design of NLLS.]

I theoretically agree here, but we are in a sort of chicken-and-egg problem. Typically, when I proposed to switch the ODE code to the new fluent API, we decided we should do it at a major release only. We also said we cannot remove deprecated code at minor releases. Now we say a major release should wait for deprecated code to be removed. So, as far as I understand, everything should occur at once at major releases: remove deprecated classes, introduce new classes and interfaces. But see the next remark.

No, there is actually no chicken-and-egg problem in this case. The issue is only that we should not delete an API before a replacement is provided. So:

1. We want to remove the deprecated optimizers API - must be done in a major release (e.g. 4.0)
2. We want to let users upgrade their code - the new API must exist before 4.0

1 and 2 are not contradictory, as the new API will be written from scratch (it is not a modification, say, of existing interfaces).

It would be nice to recode the whole LevenbergMarquardtOptimizer in full OO Java. But it should be implemented and tested before any new feature is added to the mix.

Here comes another problem: we want to have time to stabilize a new API, which means we should have a few minor releases with trial APIs. Unfortunately, once these trial APIs have been introduced, we are stuck with them and we end up piling several APIs together (the optimizers are the perfect example).

Piling up, yes, but it is not a problem: all the old APIs are already deprecated; only the replacements are missing. We must provide them in a minor release before 4.0. [This should not need to delay the next release (3.4), as we can have as many 3.x releases as we want...]

So when do we test new APIs: at minor releases just before a major release, but then we keep all the trials until a big clean-up at the major release?
[math] Least Squares Outlier Rejection
Hi,

A while ago I had brought up the idea of adding residual editing (aka data editing, outlier rejection, robust regression) to our non-linear least squares implementations.[1] As the name suggests, the idea is to de-weight observations that don't match the user's model. There are several ways to do this, including choosing a fixed cutoff, a fixed standard deviation cutoff, or reducing a residual's weight based on its magnitude.[2]

Whichever way we add the data editing feature, I think it will cause backward incompatibilities with the released API. I've outlined below the two options I see. I'm open to other ideas as well.

1. Replace edited residuals with 0's in the residual vector and Jacobian (i.e. apply a 0 weight). This has the advantage of being simple to implement, and our existing optimizers are already able to handle it. The downside is evident when the user tries to obtain the number of residuals that were edited: it is hard to tell the difference between an edited residual, an a priori zero weight, or a model evaluation where the residual and gradient are, in fact, zero. We can provide easy access to the number of edited residuals by adding a method to the Evaluation interface. (This is what I implemented in the patch in the original thread.) Now that the code has been released, though, this would cause a backward incompatibility for some advanced users. Most users will likely use the included factory and builder methods to define their LeastSquaresProblem (LSP), and these users would not be affected by the change. Only the users that provide a custom implementation of LSP.Evaluation would be affected.

2. Remove edited residuals from the residual vector and Jacobian, so that the resulting vector and matrix have fewer rows. The advantage here is that the user can compare the length of the residual vector in the Optimum to the number of observations in the LSP to determine the number of edited residuals. The problem is that returning vectors/matrices with different sizes from LSP.evaluate() would violate the contract. Additionally, we would have to modify our existing optimizers to deal with the variable lengths. For GaussNewton the modification would be small, but for LevenbergMarquardt I would likely have to re-write it, since I don't understand the code (not for lack of trying :P). Users that implement LeastSquaresOptimizer would likely have to modify their code as well.

To summarize: in both cases, users that only use the provided [math] classes would not have to modify their code, while users that provide custom implementations of [math] interfaces would have to modify their code.

I would like to get this feature wrapped in for the next release. Please let me know if you have a preference for either implementation and if there are any other issues I should consider.

Best Regards,
Evan

[1] http://markmail.org/message/e53nago3swvu3t52
https://issues.apache.org/jira/browse/MATH-1105
[2] http://www.mathworks.com/help/curvefit/removing-outliers.html
http://www.mathworks.com/help/curvefit/least-squares-fitting.html
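Option 1 above (zero-weighting the edited rows) could be sketched as follows. The fixed-cutoff rule and the names are illustrative, not the [math] API; the returned count also shows why a separate accessor is needed, since a zeroed row alone is indistinguishable from an a-priori zero weight:

```java
import java.util.Arrays;

public class ZeroWeightEditing {

    // Apply a fixed-cutoff edit in place: residuals beyond the cutoff get
    // a zero weight, i.e. their entry in the residual vector and their row
    // in the Jacobian are replaced by zeros, so an unmodified optimizer
    // simply ignores them. Returns the number of edited rows, tracked
    // separately because a zero row alone is ambiguous.
    static int applyCutoff(double[] residuals, double[][] jacobian, double cutoff) {
        int edited = 0;
        for (int i = 0; i < residuals.length; i++) {
            if (Math.abs(residuals[i]) > cutoff) {
                residuals[i] = 0.0;
                Arrays.fill(jacobian[i], 0.0);
                edited++;
            }
        }
        return edited;
    }
}
```

Because the vector and matrix keep their original sizes, existing optimizers need no modification, which is the advantage claimed for option 1.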
Re: [math] Least Squares Outlier Rejection
Hello.

On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:

Hi,

A while ago I had brought up the idea of adding residual editing (aka data editing, outlier rejection, robust regression) to our non-linear least squares implementations.[1] As the name suggests, the idea is to de-weight observations that don't match the user's model. There are several ways to do this, including choosing a fixed cutoff, a fixed standard deviation cutoff, or reducing a residual's weight based on its magnitude.[2]

Whichever way we add the data editing feature, I think it will cause backward incompatibilities with the released API. I've outlined below the two options I see. I'm open to other ideas as well.

1. Replace edited residuals with 0's in the residual vector and Jacobian (i.e. apply a 0 weight). This has the advantage of being simple to implement, and our existing optimizers are already able to handle it. The downside is evident when the user tries to obtain the number of residuals that were edited: it is hard to tell the difference between an edited residual, an a priori zero weight, or a model evaluation where the residual and gradient are, in fact, zero. We can provide easy access to the number of edited residuals by adding a method to the Evaluation interface. (This is what I implemented in the patch in the original thread.) Now that the code has been released, though, this would cause a backward incompatibility for some advanced users. Most users will likely use the included factory and builder methods to define their LeastSquaresProblem (LSP), and these users would not be affected by the change. Only the users that provide a custom implementation of LSP.Evaluation would be affected.

2. Remove edited residuals from the residual vector and Jacobian, so that the resulting vector and matrix have fewer rows. The advantage here is that the user can compare the length of the residual vector in the Optimum to the number of observations in the LSP to determine the number of edited residuals. The problem is that returning vectors/matrices with different sizes from LSP.evaluate() would violate the contract. Additionally, we would have to modify our existing optimizers to deal with the variable lengths. For GaussNewton the modification would be small, but for LevenbergMarquardt I would likely have to re-write it, since I don't understand the code (not for lack of trying :P). Users that implement LeastSquaresOptimizer would likely have to modify their code as well.

To summarize: in both cases, users that only use the provided [math] classes would not have to modify their code, while users that provide custom implementations of [math] interfaces would have to modify their code.

I would like to get this feature wrapped in for the next release. Please let me know if you have a preference for either implementation and if there are any other issues I should consider.

Compatibility breaks cannot occur in minor releases. The next major release should not occur before deprecated classes are all replaced. [I'm thinking about the optimizers, for which the fluent API should be implemented based on your design of NLLS.]

It would be nice to recode the whole LevenbergMarquardtOptimizer in full OO Java. But it should be implemented and tested before any new feature is added to the mix.

Do I understand correctly that in the robust fit, the weights are modified during the optimization? If so, would the algorithms still be standard?

At first sight, I'd avoid modification of the sizes of input data (option 2); from an API usage viewpoint, I imagine that user code will require additional length tests.

Couldn't the problem you mention in option 1 disappear by having different methods that return the a-priori weights and the modified weights?
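Gilles's suggestion of separate accessors for the a-priori and the modified weights could indeed resolve the counting ambiguity of option 1. A hypothetical sketch; these names are illustrative, not part of the released [math] API:

```java
// Hypothetical extension of an evaluation result: exposing both weight
// sets lets callers count edited points without confusing them with
// observations that had an a-priori zero weight.
interface EditedEvaluation {

    double[] getAprioriWeights();   // weights as supplied by the user
    double[] getModifiedWeights();  // weights after outlier de-weighting

    // An observation counts as edited when it had a nonzero a-priori
    // weight that the editing process reduced to zero.
    default int getEditedCount() {
        double[] apriori = getAprioriWeights();
        double[] modified = getModifiedWeights();
        int count = 0;
        for (int i = 0; i < apriori.length; i++) {
            if (apriori[i] != 0.0 && modified[i] == 0.0) {
                count++;
            }
        }
        return count;
    }
}
```

With such accessors, the edited count is derivable rather than stored, so only implementors of the interface (the "advanced users" discussed above) would notice the addition.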
Best regards,
Gilles

Best Regards,
Evan

[1] http://markmail.org/message/e53nago3swvu3t52
https://issues.apache.org/jira/browse/MATH-1105
[2] http://www.mathworks.com/help/curvefit/removing-outliers.html
http://www.mathworks.com/help/curvefit/least-squares-fitting.html