Re: [math] use case for NaNStrategy.FIXED?

Gilles Sun, 29 Jun 2014 14:31:23 -0700

On Sun, 29 Jun 2014 10:25:58 -0700, Phil Steitz wrote:

On 6/29/14, 9:48 AM, venkatesha murthy wrote:
On Sun, Jun 29, 2014 at 1:25 AM, Phil Steitz <[email protected]>wrote:
On 6/25/14, 12:12 AM, Luc Maisonobe wrote:
Le 25/06/2014 06:05, venkatesha murthy a écrit :
On Wed, Jun 25, 2014 at 12:21 AM, Luc Maisonobe<[email protected]>
wrote:
Hi Venkat,

Le 23/06/2014 21:08, venkatesha murthy a écrit :
On Tue, Jun 24, 2014 at 12:08 AM, Luc Maisonobe<[email protected]>
wrote:
Hi all,
While looking further in Percentile class for MATH-1120, Ihave foundanother problem in the current implementation.NaNStrategy.FIXED
should
leave the NaNs in place, but at the end of theKthSelector.selectmethod, a call to Arrays.sort moves the NaNs to the end of thesmallsub-array. What is really implemented here is therefore closertoNaNStrategy.MAXIMAL than NaNStrategy.FIXED. This always occurin the
test cases because they use very short arrays, and we directly
switch to
this part of the select method.
Are NaNs considered higher than +Inf ?
If MAXIMAL represent replacing for +inf ; you need something to
indicate beyond this for NaN.
Well, we can just keep the NaN themselves and handled them
appropriately, hoping not to trash performances too much.
Agreed.
What is the test input you see an issue and what is the actualerror
you are seeing. Please share the test case.
Just look at PercentileTest.testReplaceNanInRange(). The firstcheck inthe test corresponds to a Percentile configuration at 50%percentile,and NaNStrategy.FIXED. The array has an odd number of entries,so the50% percentile is exactly one of the entries: the one at index 5in the
final array.
The initial ordering is { 0, 1, 2, 3, 4, NaN, NaN, 5, 7, NaN, 8}. Sofor the NaNStrategy.FIXED setting, it should not be modified atall inthe selection algorithm and the result for 50% should be thefirst NaNof the array, at index 5. In fact, due to the Arrays.sort, we*do*reorder the array into { 0, 1, 2, 3, 4, 5, 7, 8, NaN, NaN, NaN}, so
the result is 5.

Agreed. just verified by putting back the earlier insertionSort
function.
If we use NaNStrategy.MAXIMAL and any quantile above 67%, we getas a
result Double.POSITIVE_INFINITY instead of Double.NaN.

If we agree to leave FIXED as unchanged behaviour with your
insertionSort
code; then atleast MAXIMAL/MINIMAL should be allowed fortransformation
of
NaN to +/-Inf
I'm fine with it, as long as we clearly state it in theNaNStrategy
documentation, saying somethig like:
When MAXIMAL/MINIMAL are used, the NaNs are effecively *replaced*by+/- infinity, hence the results will be computed as if infinitieswere
 there in the first place.
-1 - this is mixing concerns.  NaNStrategy exists for one specific
purpose - to turn extend the ordering on doubles a total ordering.
Strategies prescribe only how NaNs are to be treated in the
ordering.  Side effects on the input array don't make sense in this
context.  The use case for which this class was created was rank
transformations. Returning infinities in rank transformationswould
blow up computations in these classes.  If a floating point
computations need to transform NaNs, the specific stats / other
classes that do this transformation should perform and document the
behavior.

Phil
OK. Agreed realized it later. Hence i have not touched NaNStrategyin mypatch(MATH-1132) . Now i have added a separate transformer totransform NaNsbut it uses NanStrategy. Please let me know on this as thistrasnformation
itself
can be used in different places.
Honestly, I don't see the value of this.  I would be more favorable
to an addition to MathArrays supporting NaN (or infinity) filtering
/ transformation.  Several years back we made the decision to just
let NaNs propagate in computations, which basically means that if
you feed NaNs to [math] you are going to get NaNs out in results.
If we do get into the business of filtering NaNs (or infinities for
that matter), I think it is probably best to do that in MathArrays
or somesuch - i.e., don't decorate APIs with NaN or
infinity-handling handling descriptions.  That would be a huge task
and would likely end up bleeding implementation detail into the
APIs.  As I said above, NaNStrategy has to exist for rank
transformations to be well-defined.  NaN-handling in floating point
computations is defined and as long as we fully document what our
algorithms do, I think it is fair enough to "let the NaNs fall where
they may" - which basically means users need to do the filtering /
transformation themselves.


So, do I understand correctly that it's OK to add the "remove" and
"replace" utility methods (cf. MATH-1130) in "MathArrays"?
Or does this thread conclude that we don't have a use for this within
Commons Math?

Gilles


Phil

[...]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [math] use case for NaNStrategy.FIXED?

Reply via email to