Re: [math] Getting 1.0 out the door -- tasks remaining
Phil Steitz wrote:
> J.Pietschmann wrote:
>> Phil Steitz wrote:
>>> 1) Decide what to do about inverse cumulative probabilities where
>>> p = 1 (easy solution is to document and throw)
>>
>> Nearly +1
>
> My own nearly +1 on this just turned to -1. After looking some more at
> the code and thinking some more, I think that both p=1 and p=0 should
> be handled correctly in all cases. The difficult cases are when the
> probability density function has unbounded support. Here is what I
> propose for the values of inverseCumulativeProbability() at p=0 and
> p=1 for current distributions. Unless otherwise noted, these values
> are intended to be independent of distribution parameters.
>
> Distribution     p=0                        p=1
> --------------   ------------------------   ---------------------------
> Binomial         0                          Integer.MAX_VALUE
> Chisquare        0                          Double.POSITIVE_INFINITY
> Exponential      0                          Double.POSITIVE_INFINITY
> F                0                          Double.POSITIVE_INFINITY
> Gamma            0                          Double.POSITIVE_INFINITY
> HyperGeometric   0                          finite, parameter-dependent
> Normal           Double.NEGATIVE_INFINITY   Double.POSITIVE_INFINITY
> T                Double.NEGATIVE_INFINITY   Double.POSITIVE_INFINITY
>
> Other than the value for Chisquare with p=1 (which causes R to hang),
> these values are consistent with what R returns using the q*
> functions. It might be more convenient to return Double.MAX_VALUE,
> -Double.MAX_VALUE in place of the infinities (since then we could just
> use getDomainLowerBound at 0 and 1) but this would not be correct
> mathematically. If there are no objections, I will find a way to get
> the values above returned.

I have committed changes and tests to ensure that the values in the
table above are returned, modulo correcting the following mistake: both
of the discrete distributions (Binomial and Hypergeometric) should
return -1 for inverseCumulativeProbability(0). The definition that we
are using is that inverseCumulativeProbability(p) = the largest x such
that P(X <= x) <= p. Since 0 has positive probability for both the
Binomial and Hypergeometric distributions, P(X <= 0) > 0, so no
non-negative x satisfies the condition at p = 0. Because the function is
integer-valued, the correct value to return in these cases is -1, not 0.

Phil
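
A minimal sketch of the discrete definition discussed above (largest x such that P(X <= x) <= p), worked out for a Binomial distribution. This is not the committed Commons Math code: the class name, the method names and the term-by-term CDF are assumptions made for illustration, and the sketch assumes 0 < prob < 1.

```java
public class DiscreteInverseCdfSketch {

    /** P(X <= x) for X ~ Binomial(n, prob), summed term by term (assumes 0 < prob < 1). */
    static double binomialCdf(int x, int n, double prob) {
        if (x < 0) {
            return 0.0;
        }
        if (x >= n) {
            return 1.0;
        }
        double sum = 0.0;
        double term = Math.pow(1.0 - prob, n); // P(X = 0)
        for (int k = 0; k <= x; k++) {
            sum += term;
            // Recurrence: P(X = k+1) = P(X = k) * (n-k)/(k+1) * prob/(1-prob)
            term *= (double) (n - k) / (k + 1) * prob / (1.0 - prob);
        }
        return Math.min(sum, 1.0);
    }

    /** Largest integer x such that P(X <= x) <= p, found by a linear scan upward from -1. */
    static int inverseCumulativeProbability(double p, int n, double prob) {
        if (Double.isNaN(p) || p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        if (p == 1.0) {
            // Every integer satisfies P(X <= x) <= 1, so there is no largest one;
            // Integer.MAX_VALUE is the stand-in proposed in the thread.
            return Integer.MAX_VALUE;
        }
        int x = -1; // P(X <= -1) = 0 <= p always holds
        while (x < n && binomialCdf(x + 1, n, prob) <= p) {
            x++;
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(inverseCumulativeProbability(0.0, 10, 0.3));
        System.out.println(inverseCumulativeProbability(1.0, 10, 0.3));
    }
}
```

At p = 0 the scan never advances, because P(X <= 0) is already positive, which is why the result is -1 rather than 0.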
Re: [math] Getting 1.0 out the door -- tasks remaining
--- Phil Steitz wrote:
> J.Pietschmann wrote:
>> Phil Steitz wrote:
>>> 1) Decide what to do about inverse cumulative probabilities where
>>> p = 1 (easy solution is to document and throw)
>>
>> Nearly +1
>
> My own nearly +1 on this just turned to -1. After looking some more at
> the code and thinking some more, I think that both p=1 and p=0 should
> be handled correctly in all cases. The difficult cases are when the
> probability density function has unbounded support. Here is what I
> propose for the values of inverseCumulativeProbability() at p=0 and
> p=1 for current distributions. Unless otherwise noted, these values
> are intended to be independent of distribution parameters.
>
> Distribution     p=0                        p=1
> --------------   ------------------------   ---------------------------
> Binomial         0                          Integer.MAX_VALUE
> Chisquare        0                          Double.POSITIVE_INFINITY
> Exponential      0                          Double.POSITIVE_INFINITY
> F                0                          Double.POSITIVE_INFINITY
> Gamma            0                          Double.POSITIVE_INFINITY
> HyperGeometric   0                          finite, parameter-dependent
> Normal           Double.NEGATIVE_INFINITY   Double.POSITIVE_INFINITY
> T                Double.NEGATIVE_INFINITY   Double.POSITIVE_INFINITY
>
> Other than the value for Chisquare with p=1 (which causes R to hang),
> these values are consistent with what R returns using the q*
> functions. It might be more convenient to return Double.MAX_VALUE,
> -Double.MAX_VALUE in place of the infinities (since then we could just
> use getDomainLowerBound at 0 and 1) but this would not be correct
> mathematically. If there are no objections, I will find a way to get
> the values above returned.

+1 to the values in the table above. As a user I would prefer to be
returned an infinity rather than MAX_VALUE where possible (it's too bad
the integer types don't provide infinity values), because even though I
would often recognize 1e+308 or thereabouts as Double.POSITIVE_INFINITY,
I would still have to do that conversion mentally, and I would always
wonder whether the returned value was actually MAX_VALUE or just the
implementation-dependent representation of POSITIVE_INFINITY.

Also consider what would happen if the data type were changed to float.
Then if MAX_VALUE were used, the numeric value returned for p = 1 would
differ depending on the data type. With the infinity values, although
there's a class difference between Double.POSITIVE_INFINITY and
Float.POSITIVE_INFINITY, the concept is clearly identical.

It's strange that BigDecimal doesn't provide infinity values, though.
Maybe that's something Commons should address at some point.

Al
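
The float-versus-double point can be checked with a short, self-contained snippet. The class and method names are made up for illustration; only standard java.lang constants are used.

```java
public class InfinityVersusMaxValue {

    /** True if float's largest finite value, widened to double, equals double's. */
    static boolean maxValuesAgree() {
        return (double) Float.MAX_VALUE == Double.MAX_VALUE;
    }

    /** True if float's positive infinity, widened to double, equals double's. */
    static boolean infinitiesAgree() {
        return (double) Float.POSITIVE_INFINITY == Double.POSITIVE_INFINITY;
    }

    public static void main(String[] args) {
        // The MAX_VALUE constants are different numbers in the two types,
        // while the infinities widen to the same value.
        System.out.println("MAX_VALUE agrees across types: " + maxValuesAgree());
        System.out.println("POSITIVE_INFINITY agrees across types: " + infinitiesAgree());
        // MAX_VALUE is an ordinary finite number, so isInfinite() cannot detect it.
        System.out.println("Double.MAX_VALUE is infinite: " + Double.isInfinite(Double.MAX_VALUE));
    }
}
```

A caller can test an infinite result with Double.isInfinite(), whereas a MAX_VALUE sentinel needs an exact comparison against a type-specific constant.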
Re: [math] Getting 1.0 out the door -- tasks remaining
J.Pietschmann wrote:
> Phil Steitz wrote:
>> 1) Decide what to do about inverse cumulative probabilities where
>> p = 1 (easy solution is to document and throw)
>
> Nearly +1

My own nearly +1 on this just turned to -1. After looking some more at
the code and thinking some more, I think that both p=1 and p=0 should be
handled correctly in all cases. The difficult cases are when the
probability density function has unbounded support. Here is what I
propose for the values of inverseCumulativeProbability() at p=0 and p=1
for current distributions. Unless otherwise noted, these values are
intended to be independent of distribution parameters.

Distribution     p=0                        p=1
--------------   ------------------------   ---------------------------
Binomial         0                          Integer.MAX_VALUE
Chisquare        0                          Double.POSITIVE_INFINITY
Exponential      0                          Double.POSITIVE_INFINITY
F                0                          Double.POSITIVE_INFINITY
Gamma            0                          Double.POSITIVE_INFINITY
HyperGeometric   0                          finite, parameter-dependent
Normal           Double.NEGATIVE_INFINITY   Double.POSITIVE_INFINITY
T                Double.NEGATIVE_INFINITY   Double.POSITIVE_INFINITY

Other than the value for Chisquare with p=1 (which causes R to hang),
these values are consistent with what R returns using the q* functions.
It might be more convenient to return Double.MAX_VALUE,
-Double.MAX_VALUE in place of the infinities (since then we could just
use getDomainLowerBound at 0 and 1) but this would not be correct
mathematically. If there are no objections, I will find a way to get the
values above returned.

Phil
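
For a distribution with a closed-form inverse, the proposed endpoint values can be sketched as below, using the Exponential row of the table. This is an illustration only, not the library's implementation: the class name and the (p, mean) signature are assumptions, and the formula used is the standard x = -mean * ln(1 - p).

```java
public class ExponentialInverseCdfSketch {

    /**
     * Inverse CDF of an exponential distribution with the given mean,
     * with p = 0 and p = 1 handled explicitly rather than rejected.
     */
    static double inverseCumulativeProbability(double p, double mean) {
        if (Double.isNaN(p) || p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        if (p == 0.0) {
            return 0.0; // lower end of the support
        }
        if (p == 1.0) {
            return Double.POSITIVE_INFINITY; // unbounded support above
        }
        // log1p(-p) computes ln(1 - p) without losing precision for small p
        return -mean * Math.log1p(-p);
    }

    public static void main(String[] args) {
        System.out.println(inverseCumulativeProbability(0.0, 2.0));
        System.out.println(inverseCumulativeProbability(0.5, 2.0));
        System.out.println(inverseCumulativeProbability(1.0, 2.0));
    }
}
```

Distributions without a closed form, where the inverse is found by root bracketing, would need the same two endpoint checks before the search starts, since an infinite bound cannot be bracketed.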
Re: [math] Getting 1.0 out the door -- tasks remaining
J.Pietschmann wrote:
>> 2) Decide what, if anything to do about the root-finding interfaces.
>> I am OK releasing as is.
>
> Uh, oh!

Does that mean that you think we need to change the interfaces? If so,
how exactly? Along the lines that I suggested earlier (stateless, value
objects returned)?

Phil
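
One possible reading of "stateless, value objects returned" is sketched below. It does not reproduce the existing solver interfaces or the earlier proposal, whose details are not in this thread: RootSolver, RootResult and Bisection are hypothetical names, and the sketch uses java.util.function.DoubleUnaryOperator (Java 8+) purely for brevity.

```java
import java.util.function.DoubleUnaryOperator;

public class StatelessSolverSketch {

    /** Immutable value object carrying everything a solve call produced. */
    static final class RootResult {
        final double root;
        final int iterations;

        RootResult(double root, int iterations) {
            this.root = root;
            this.iterations = iterations;
        }
    }

    /** Hypothetical stateless interface: all inputs are arguments, all outputs are returned. */
    interface RootSolver {
        RootResult solve(DoubleUnaryOperator f, double lo, double hi, double tolerance);
    }

    /** Plain bisection; holds no fields, so one instance can be shared freely. */
    static final class Bisection implements RootSolver {
        @Override
        public RootResult solve(DoubleUnaryOperator f, double lo, double hi, double tolerance) {
            double fLo = f.applyAsDouble(lo);
            double fHi = f.applyAsDouble(hi);
            if (fLo * fHi > 0.0) {
                throw new IllegalArgumentException("root is not bracketed by [lo, hi]");
            }
            int iterations = 0;
            while (hi - lo > tolerance) {
                double mid = 0.5 * (lo + hi);
                double fMid = f.applyAsDouble(mid);
                if (fLo * fMid <= 0.0) {
                    hi = mid; // sign change (or exact zero) lies in [lo, mid]
                } else {
                    lo = mid; // sign change lies in [mid, hi]
                    fLo = fMid;
                }
                iterations++;
            }
            return new RootResult(0.5 * (lo + hi), iterations);
        }
    }

    public static void main(String[] args) {
        RootResult r = new Bisection().solve(x -> x * x - 2.0, 0.0, 2.0, 1e-10);
        System.out.println("root ~ " + r.root + " after " + r.iterations + " iterations");
    }
}
```

The tradeoff in this style is one small allocation per call in exchange for not having to query the solver afterwards for its result or iteration count.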
Re: [math] Getting 1.0 out the door -- tasks remaining
Phil Steitz wrote:
> root-finding interfaces. ... Does that mean that you think we need to
> change the interfaces? If so, how exactly? Along the lines that I
> suggested earlier (stateless, value objects returned)?

Actually I don't know how to proceed. It would be nice to have a common
pattern for the interfaces for solving non-linear equations (aka root
finding), solving systems of linear equations, interpolation and perhaps
some functions from the stat area. OTOH, each problem has some unique
aspects and performance tradeoffs (in terms of copying stuff and perhaps
more), and I have no good idea how to get this unified while keeping it
reasonably simple and intuitive. Duh!

J.Pietschmann
Re: [math] Getting 1.0 out the door -- tasks remaining
Phil Steitz wrote:
> 1) Decide what to do about inverse cumulative probabilities where
> p = 1 (easy solution is to document and throw)

Nearly +1

> 2) Decide what, if anything to do about the root-finding interfaces.
> I am OK releasing as is.

Uh, oh!

> 4) Decide what to do about RealMatrix rank. Only reasonable solution
> at this point appears to be to drop it from the interface.

I'd vote for dropping it. A robust implementation would require SVD,
which is quite complex in itself, and I personally never found a real
use for a matrix rank unless it dropped out of a related computation as
a side effect anyway.

> 6) Decide whether or not to add BigDecimalMatrix.

I'm undecided; if the unit tests are up to a decent coverage, I think it
could be included.

J.Pietschmann