[jira] [Commented] (MATH-437) Kolmogorov Smirnov Distribution

2014-05-04 Thread Phil Steitz (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989195#comment-13989195
 ] 

Phil Steitz commented on MATH-437:
--

Updated user guide and TestUtils in r1592430.  All that remains now is removing 
the deprecated classes in 4.0

> Kolmogorov Smirnov Distribution
> ---
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Mikkel Meyer Andersen
>Assignee: Phil Steitz
>Priority: Minor
> Fix For: 4.0
>
> Attachments: MATH437-with-test-take-1, ks-distribution.patch
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> The Kolmogorov-Smirnov test (see [1]) is used to test whether one sample 
> comes from a known probability distribution, or whether two samples are from 
> the same distribution. To evaluate the test statistic, the Kolmogorov-Smirnov 
> distribution is used. Quite good asymptotics exist for the one-sided test, 
> but it's more difficult for the two-sided test.
> [1]: http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
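
For readers new to the issue, here is a minimal sketch of the one-sample statistic that the description refers to, D_n = sup |F_n(x) - F(x)|. The class and method names are hypothetical and are not taken from Commons Math or from the attached patches; the hypothesized CDF is passed in as a plain function.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper for illustration only; not part of Commons Math. */
final class KsStatisticSketch {

    private KsStatisticSketch() { }

    /**
     * One-sample statistic D_n = sup_x |F_n(x) - F(x)|, where F_n is the
     * empirical CDF of the sample and F is the hypothesized CDF.
     */
    static double oneSampleStatistic(double[] sample, DoubleUnaryOperator cdf) {
        if (sample.length == 0) {
            throw new IllegalArgumentException("sample must not be empty");
        }
        double[] sorted = sample.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double f = cdf.applyAsDouble(sorted[i]);
            // F_n jumps from i/n to (i+1)/n at sorted[i]; check both sides.
            double above = (i + 1.0) / n - f;
            double below = f - (double) i / n;
            d = Math.max(d, Math.max(above, below));
        }
        return d;
    }

    public static void main(String[] args) {
        // CDF of the uniform distribution on [0, 1].
        DoubleUnaryOperator uniformCdf = x -> Math.min(1.0, Math.max(0.0, x));
        double[] sample = {0.75, 0.25, 0.5};
        System.out.println(oneSampleStatistic(sample, uniformCdf));
    }
}
```

The p-value is then obtained by evaluating the distribution of D_n at the observed value, which is the part this issue is about.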



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MATH-437) Kolmogorov Smirnov Distribution

2014-02-26 Thread Phil Steitz (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13913713#comment-13913713
 ] 

Phil Steitz commented on MATH-437:
--

First cut added in r1572335.

I don't want to hold up the 3.3 release for this, but pre- or post-3.3 we 
still need to:

1. Update user guide
2. Add static methods to TestUtils




[jira] [Commented] (MATH-437) Kolmogorov Smirnov Distribution

2014-02-20 Thread Luc Maisonobe (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907103#comment-13907103
 ] 

Luc Maisonobe commented on MATH-437:


Sure, we can wait for this issue to be fixed before releasing 3.3 if you think 
it can be done soon.



[jira] [Commented] (MATH-437) Kolmogorov Smirnov Distribution

2014-02-20 Thread Phil Steitz (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907012#comment-13907012
 ] 

Phil Steitz commented on MATH-437:
--

I have the first part of this - the new class and deprecation of the old - just 
about ready to commit.  I would like to slide this into 3.3 if I can have until 
the end of this week to finish it.



[jira] [Commented] (MATH-437) Kolmogorov Smirnov Distribution

2013-03-27 Thread Luc Maisonobe (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13615110#comment-13615110
 ] 

Luc Maisonobe commented on MATH-437:


+1 to postpone to 4.0.

I have no opinion about the usefulness of the distribution itself, so if you 
think it would be worth removing this part and having only the test, this would 
be fine with me.

> Kolmogorov Smirnov Distribution
> ---
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Mikkel Meyer Andersen
>Assignee: Phil Steitz
>Priority: Minor
> Fix For: 3.2
>
> Attachments: ks-distribution.patch, MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] [Commented] (MATH-437) Kolmogorov Smirnov Distribution

2013-03-26 Thread Phil Steitz (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614601#comment-13614601
 ] 

Phil Steitz commented on MATH-437:
--

I think we should bump this to 4.0 or at least 3.3. It was probably a mistake 
to put K-S in the distribution package.  The K-S distribution itself is of 
little practical usefulness (to my knowledge at least).  I have never seen it 
used for anything but performing K-S tests.  It is tricky enough to compute the 
distribution function itself with any kind of numerical stability, as the 
comments above and the literature around K-S tests confirm.  Computing moments 
is, as the reference where Luc (resourcefully!) found test data states, 
"intractable."  I think it may be best to steer clear of this and focus on just 
getting a good implementation of the test itself, which should move to 
.inference.   I would prefer to do a little more research though to decide how 
best to set up the API and implementation for the test.  It could be we would 
be better off not using the cdfs in the current impl, instead using a beta 
approximation to compute p-values as in [1].  Note also that since the 
discussion above / initial implementation, Simard has published [2] with some 
empirical findings on how the various K-S approximation methods perform.

So to summarize, I think the first step is to agree on the K-S test API.  Then 
deprecate the class in .distribution and move the test class to .inference.

[1] http://www.ism.ac.jp/editsec/aism/pdf/054_3_0577.pdf
[2] http://www.jstatsoft.org/v39/i11/paper

> Kolmogorov Smirnov Distribution
> ---
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Mikkel Meyer Andersen
>Assignee: Phil Steitz
>Priority: Minor
> Fix For: 3.2
>
> Attachments: ks-distribution.patch, MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] [Commented] (MATH-437) Kolmogorov Smirnov Distribution

2013-03-16 Thread Phil Steitz (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604413#comment-13604413
 ] 

Phil Steitz commented on MATH-437:
--

Pushing this to top of my [math] list.  The patch makes great use of our 
numerics :) but I have to think there are better ways to compute these things.  
Also, we need to separate the test class, which should go into .stat.inference 
from the distribution class.  If this is the last bug holding up 3.2, feel free 
to bump to 4.0 or 3.3.

> Kolmogorov Smirnov Distribution
> ---
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Mikkel Meyer Andersen
>Priority: Minor
> Fix For: 3.2
>
> Attachments: ks-distribution.patch, MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

2011-03-20 Thread Phil Steitz (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008936#comment-13008936
 ] 

Phil Steitz commented on MATH-437:
--

+1 to commit as is, adding some algorithm notes to the class javadoc and the 
MATH-435 power impl.  I am ambivalent on whether or not to "fix" the error in 
Marsaglia's code that is apparently included in R.  Having the verification 
tests is good, though, so I would leave as is in the patch, since the Marsaglia 
C impl can be seen as a reference in this case.  I can see the other side of 
the argument here, though and would be fine with just going with the fixed 
code, suitably documented.  What do others think about this?

It looks like you forgot to add the references to the class javadoc for the 
impl class.

Per comment on MATH-435, I think we should add the matrix power impl there and 
use it here.

> Kolmogorov Smirnov Distribution
> ---
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Mikkel Meyer Andersen
>Assignee: Mikkel Meyer Andersen
>Priority: Minor
> Fix For: 3.0
>
> Attachments: MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h


[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

2010-12-29 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12975771#action_12975771
 ] 

Mikkel Meyer Andersen commented on MATH-437:


In the past months, I've communicated with both Richard Simard and George 
Marsaglia regarding a small disagreement between the theory in Marsaglia's 
article and the actual implementation; namely the fact that 0 <= h < 1, but in 
the code 0 < h <= 1. I wrote to Marsaglia regarding this, and his answer was: 
{quote}
The Kolmogorov distribution comes from a piecewise polynomial in h with knots 
at 1/2n, 2/2n,...,(2n-1)/2n,  with each segment assumed to start with h=0. 
Although I emphasized that 0<= h <1 in the article,  I overlooked the need for 
ensuring that in the C code, and apparently so did my colleagues. Sorry about 
that.
{quote}
This means that his code has to be changed slightly to ensure that 0 <= h < 1. 
Simard argues that this shouldn't mean anything because KS distribution is 
continuous, but if one wants to correct it, one should
{quote}
Instead of taking the floor(n*d + 1) and making this correction for h = 1, take 
the ceiling (n*d).
{quote}

I would prefer using ceiling(n*d) instead of the originally (wrongly) proposed 
floor(n*d + 1), despite the continuity argument. So my plan is to do this (I 
still have my implementation, which seems to work quite well). The only problem 
is that R seems to use Marsaglia's code, and I don't have access to e.g. 
Mathematica, which should implement several algorithms, so I might run into 
problems when I have to perform tests.
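
To make the difference concrete, here is a minimal sketch comparing the two index choices. The class and method names are hypothetical, and the relation h = k - n*d is my reading of the decomposition d = (k - h)/n in Marsaglia's article rather than code taken from the patch.

```java
/** Hypothetical illustration; the names are not taken from the patch. */
final class KsIndexSketch {

    private KsIndexSketch() { }

    /** Index as in the C code discussed above: k = floor(n*d + 1). */
    static int kFloor(int n, double d) {
        return (int) Math.floor(n * d + 1);
    }

    /** Index suggested by Simard: k = ceil(n*d). */
    static int kCeil(int n, double d) {
        return (int) Math.ceil(n * d);
    }

    /** Offset h = k - n*d for a given index k (assumed from d = (k - h)/n). */
    static double h(int k, int n, double d) {
        return k - n * d;
    }

    public static void main(String[] args) {
        int n = 4;
        for (double d : new double[] {0.3, 0.5}) {
            int kf = kFloor(n, d);
            int kc = kCeil(n, d);
            System.out.println("d=" + d
                    + " floor: k=" + kf + ", h=" + h(kf, n, d)
                    + " ceil: k=" + kc + ", h=" + h(kc, n, d));
        }
    }
}
```

The two choices only differ when n*d is an integer: there floor(n*d + 1) yields h = 1, while ceil(n*d) yields h = 0 and so respects 0 <= h < 1.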

> Kolmogorov Smirnov Distribution
> ---
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Mikkel Meyer Andersen
>Assignee: Mikkel Meyer Andersen
>Priority: Minor
> Fix For: 3.0
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h



[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

2010-11-17 Thread Richard Simard (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933166#action_12933166
 ] 

Richard Simard commented on MATH-437:
-

If you used the same x in all 3 cases, I believe there is a bug in your exact 
and non-exact code, because you get only 2 decimal digits of precision.

> Kolmogorov Smirnov Distribution
> ---
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Mikkel Meyer Andersen
>Assignee: Mikkel Meyer Andersen
>Priority: Minor
> Attachments: KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h



[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

2010-11-17 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933161#action_12933161
 ] 

Mikkel Meyer Andersen commented on MATH-437:


Richard,
Thanks for your knowledgeable comment. To quote you:
{quote}"The argument x that you used in the Simard-L'écuyer program is not the 
same that you used for the other two programs."{quote}
I'm not sure what you mean by that. Which argument should I use to get the 
same result?

> Kolmogorov Smirnov Distribution
> ---
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Mikkel Meyer Andersen
>Assignee: Mikkel Meyer Andersen
>Priority: Minor
> Attachments: KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h



[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

2010-11-17 Thread Richard Simard (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933077#action_12933077
 ] 

Richard Simard commented on MATH-437:
-

   http://www.mail-archive.com/issues@commons.apache.org/msg15829.html

F(n, x) = F(200, 0.03):
  Lecuyer (2.0 ms.) = 0.012916146481628863
  KolmogorovSmirnovDistribution exact (51902.0 ms.) = 0.012149763742041911
  KolmogorovSmirnovDistribution !exact (9.0 ms.) = 0.012149763742041922


The argument x that you used in the Simard-L'écuyer program is not the same 
that you used for the other two programs. Of course you then get very 
different results. If I compute exactly in Mathematica, I obtain

F(200, 0.03) = 0.0129161464816289

which is very different from your exact result above and agrees well with our 
program.



=
  Richard Simard
  Laboratoire de simulation et d'optimisation
  Université de Montréal, IRO



> Kolmogorov Smirnov Distribution
> ---
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Mikkel Meyer Andersen
>Assignee: Mikkel Meyer Andersen
>Priority: Minor
> Attachments: KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h



[jira] Commented: (MATH-437) Kolmogorov Smirnov Distribution

2010-11-15 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932361#action_12932361
 ] 

Mikkel Meyer Andersen commented on MATH-437:


The last part of roundedK can be replaced with
{{
// Multiply the matrix-power entry by n!/n^n, one factor i/n at a time.
double pFrac = Hpower.getEntry(k - 2, k - 2);

for (int i = 1; i <= n; ++i) {
    pFrac *= (double) i / (double) n;
}

return pFrac;
}}
to get an even better running time while keeping the results precise:
{{
F(n, x) = F(200, 0.02):
  Lecuyer (3.0 ms.) = 5.151982014280042E-6
  KolmogorovSmirnovDistribution exact (760.0 ms.) = 5.15198201428005E-6
  KolmogorovSmirnovDistribution !exact (16.0 ms.) = 5.151982014280049E-6
-

F(n, x) = F(200, 0.03):
  Lecuyer (2.0 ms.) = 0.012916146481628863
  KolmogorovSmirnovDistribution exact (51902.0 ms.) = 0.012149763742041911
  KolmogorovSmirnovDistribution !exact (9.0 ms.) = 0.012149763742041922
-

F(n, x) = F(200, 0.04):
  Lecuyer (0.0 ms.) = 0.1067121882956352
  KolmogorovSmirnovDistribution exact (5903.0 ms.) = 0.10671370113626812
  KolmogorovSmirnovDistribution !exact (6.0 ms.) = 0.10671370113626813
-
}}
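
As a side note on why the factor-by-factor loop is worth keeping: the loop multiplies by i/n for i = 1..n, which amounts to n!/n^n. A minimal sketch, with hypothetical names that are not part of the patch, comparing it with forming n! and n^n separately in double arithmetic:

```java
/** Hypothetical illustration of the n!/n^n scaling; not part of the patch. */
final class ScalingSketch {

    private ScalingSketch() { }

    /** Forms n! and n^n separately; both overflow a double for large n. */
    static double naive(int n) {
        double factorial = 1.0;
        for (int i = 2; i <= n; i++) {
            factorial *= i;
        }
        return factorial / Math.pow(n, n);
    }

    /** Multiplies by i/n one factor at a time, as in the loop above. */
    static double incremental(int n) {
        double result = 1.0;
        for (int i = 1; i <= n; i++) {
            result *= (double) i / (double) n;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("naive(200)       = " + naive(200));
        System.out.println("incremental(200) = " + incremental(200));
    }
}
```

For n = 200 both n! and n^n exceed the double range, so the naive quotient is Infinity / Infinity = NaN, while the incremental product stays finite because every factor is at most 1.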

> Kolmogorov Smirnov Distribution
> ---
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Mikkel Meyer Andersen
>Assignee: Mikkel Meyer Andersen
>Priority: Minor
> Attachments: KolmogorovSmirnovDistribution.java
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h