On 7/15/13 6:55 PM, sebb wrote: > At present the Frequency class seems to treat NaN as just another Comparable. > On my system it sorts as above POSITIVE_INFINITY. > > So getMode() can return NaN entries, as can the iterators. > > Is this reasonable behaviour? > Should it be documented and therefore tested? > > Or should NaN be disallowed from entering the frequency table? > > I suppose it would be easy enough to subclass Frequency and override > incrementValue(Comparable<?> v, long increment) to reject NaNs. > > But then we would need to document that addValue(Comparable<?> v) > calls it - or the subclass would need to override both to be sure.
My opinion is that Frequency should not treat Double.NaN specially. This is because Frequency *requires* a total ordering of whatever domain objects it works with and the semantics of all of its methods are defined in terms of the domain and the specified comparator. The (default) natural ordering (expressed by Double.compareTo) puts NaN at the top of the ordering (making it inconsistent with equals, but keeping it total). I don't think it is a good idea to try to put special code in to restrict the domain just for Doubles. It probably is a good idea to document and test the current behavior; though I suspect that (other than the case below) Double will rarely be used as the domain of a Frequency instance, as discrete distribution values are most often mapped onto integers. In StatUtils, on the other hand, I think it makes sense to drop NaNs from mode(double[] sample), since in this case we know the arguments are doubles and we know what NaN means (or more precisely, what it does not mean :) Phil > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [email protected] > For additional commands, e-mail: [email protected] > > --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
