[ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178481#comment-13178481 ]
Christian Winter commented on MATH-692:
---------------------------------------

I guess there is no alternative to this way of making probabilistic test cases pass. However, I understand your unease about fixing failures this way. The problem is that probabilistic tests are quite fuzzy: neither a passed test nor a failed test gives a clear answer as to whether the implementation is right or wrong. A correct implementation merely has a high chance of passing such a test. With an erroneous implementation, the chance of failure increases because of systematic deviations in the generated data. These chances determine how easy it is to find a seed that passes the tests. Thus, difficulty in finding a suitable seed is an indicator of problems in the code.

> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
>                 Key: MATH-692
>                 URL: https://issues.apache.org/jira/browse/MATH-692
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
>            Reporter: Christian Winter
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: MATH-692_integerDomain_patch1.patch,
>                      Math-692_realDomain_patch1.patch
>
>
> There are some inconsistencies in the documentation and implementation of
> functions regarding cumulative probabilities and inverse cumulative
> probabilities. More precisely, '<' and '<=' are not used in a consistent way.
>
> Besides, I would move the function inverseCumulativeProbability(double) to
> the interface Distribution. A true inverse of the distribution function
> exists neither for Distribution nor for ContinuousDistribution. Thus we need
> to define the inverse in terms of quantiles anyway, and this can already be
> done for Distribution.
> On the whole, I would declare the (inverse) cumulative probability functions
> in the basic distribution interfaces as follows:
>
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1)
>   [see also 1)]
> - inverseCumulativeProbability(double p): returns the quantile function
>   inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
>
> 1) An alternative definition could be P(x0 <= X <= x1). But this requires
> putting the function probability(double x) or another cumulative probability
> function into the interface Distribution in order to be able to calculate
> P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution,
> because the definition there does not specify what to do if there are
> multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning
> sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's
> support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefer the definition from above for the
> following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the
>   other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't
>   change when switching to "x in Z", i.e. the result is independent of
>   considering the integers as the sole set or as part of the reals.
>
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
>
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1)
>   [see also 1) above]
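
For illustration, the declarations proposed in the quoted description could be sketched in Java for a toy integer distribution roughly as follows. This is only a sketch under stated assumptions: the class QuantileSketch, the PMF array, and the sample helper are hypothetical names made up for this example and are not Commons Math API. The case p = 0 discussed in note 3) is deliberately left out.

```java
import java.util.Random;

/** Hypothetical illustration class; not part of Commons Math. */
final class QuantileSketch {

    // Toy distribution on the integers 0..3 with P(X = k) = PMF[k].
    // Dyadic values are chosen so that the partial sums are exact in binary.
    static final double[] PMF = {0.125, 0.25, 0.125, 0.5};

    /** Returns P(X <= x). */
    static double cumulativeProbability(int x) {
        double sum = 0.0;
        for (int k = 0; k < PMF.length && k <= x; k++) {
            sum += PMF[k];
        }
        return sum;
    }

    /** Returns P(x0 < X <= x1); the lower bound is excluded. */
    static double cumulativeProbability(int x0, int x1) {
        return cumulativeProbability(x1) - cumulativeProbability(x0);
    }

    /**
     * Returns inf{x in Z | P(X <= x) >= p} for 0 < p <= 1.
     * Assumption of this sketch: p = 0 is rejected instead of being
     * mapped to -infinity or to the infimum of the support.
     */
    static int inverseCumulativeProbability(double p) {
        if (!(p > 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException("p must be in (0, 1]: " + p);
        }
        double sum = 0.0;
        for (int k = 0; k < PMF.length; k++) {
            sum += PMF[k];
            if (sum >= p) {
                return k; // smallest k whose cumulative probability reaches p
            }
        }
        return PMF.length - 1; // fallback in case of rounding near p = 1
    }

    /** Inverse transform sampling built on the quantile function. */
    static int sample(Random rng) {
        // nextDouble() lies in [0, 1), so 1 - u lies in (0, 1].
        return inverseCumulativeProbability(1.0 - rng.nextDouble());
    }

    public static void main(String[] args) {
        System.out.println(inverseCumulativeProbability(0.375));
        System.out.println(cumulativeProbability(0, 2));
        System.out.println(sample(new Random(42L)));
    }
}
```

With the "inf" definition, a value of p that falls exactly on a jump of the distribution function maps to the point where the jump occurs, so the result is unambiguous even when several x satisfy P(X<=x) = p, and the sampler needs no special cases.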