Re: statistics library?
On Oct 10, 2011, at 4:36 PM, Ben Evans wrote:
> There should be 1.2.4 (and a snapshot of 1.3.0) up on clojars now.
>
> Could I ask you to give one of them a go, and mail your findings to
> the list? We have our regular Incanter Hack Day coming up next
> weekend, so if things are still b0rken for you, I can try to find a
> developer to look at the problem for you at the Hack day.

Searching for incanter at clojars I find only one 1.2.4 item: incanter/incanter-latex 1.2.4 -- I don't think this is what I want... Is it? I just want the statistics functions, not anything having to do with latex.

I see the 1.3 snapshot, but when I try it by including [incanter "1.3.0-SNAPSHOT"] in my project.clj dependencies "lein deps" fails with:

---
Unable to resolve artifact: Missing:
--
1) incanter:incanter-latex:jar:1.3.0-SNAPSHOT
---

So I haven't yet been able to actually try the statistical tests.

-Lee

--
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: statistics library?
Hi Lee,

On Wed, Sep 28, 2011 at 12:43 AM, Lee Spector wrote:
> On Sep 27, 2011, at 5:44 PM, David Powell wrote:
>
>> I see that there was a recent fix made to Incanter:
>>
>> Fixed typo in :lower-tail? keyword.
>> This was causing the complement of the p-value to be returned.
>>
>> https://github.com/liebke/incanter/pull/39
>>
>> Have you tried the latest version in git? Does this fix the problem?
>
> Hmm. I had asked about the version on the Incanter list too. I now see that I
> was using a *newer* version than the newest one at
> https://github.com/liebke/incanter.
>
> I grabbed what appeared to be the newest on clojars, which is [incanter
> "1.2.3"], while the newest download on that github project page appears to be
> 1.2.2 from April 20, 2010.
>
> It does sound like the comment that you quoted might indeed be about the bug
> that I ran into, so maybe it's fixed in some version of Incanter somewhere...

There should be 1.2.4 (and a snapshot of 1.3.0) up on clojars now.

Could I ask you to give one of them a go, and mail your findings to the list? We have our regular Incanter Hack Day coming up next weekend, so if things are still b0rken for you, I can try to find a developer to look at the problem for you at the Hack day.

Thanks,

Ben
Re: statistics library?
Depending on the project (and I don't know if it's still supported in 1.3), you ought to be able to leverage Mathematica Player with Clojuratica for more powerful operations.

On Sep 27, 6:43 pm, Lee Spector wrote:
> On Sep 27, 2011, at 5:44 PM, David Powell wrote:
>
> > I see that there was a recent fix made to Incanter:
>
> > Fixed typo in :lower-tail? keyword.
> > This was causing the complement of the p-value to be returned.
>
> > https://github.com/liebke/incanter/pull/39
>
> > Have you tried the latest version in git? Does this fix the problem?
>
> Hmm. I had asked about the version on the Incanter list too. I now see that I
> was using a *newer* version than the newest one
> at https://github.com/liebke/incanter.
>
> I grabbed what appeared to be the newest on clojars, which is [incanter
> "1.2.3"], while the newest download on that github project page appears to be
> 1.2.2 from April 20, 2010.
>
> It does sound like the comment that you quoted might indeed be about the bug
> that I ran into, so maybe it's fixed in some version of Incanter somewhere...
> But for my current purposes I have more faith in
> [org.apache.commons/commons-math "2.0"].
>
> Thanks,
>
> -Lee
Re: statistics library?
On Sep 27, 2011, at 5:44 PM, David Powell wrote:
> I see that there was a recent fix made to Incanter:
>
> Fixed typo in :lower-tail? keyword.
> This was causing the complement of the p-value to be returned.
>
> https://github.com/liebke/incanter/pull/39
>
> Have you tried the latest version in git? Does this fix the problem?

Hmm. I had asked about the version on the Incanter list too. I now see that I was using a *newer* version than the newest one at https://github.com/liebke/incanter.

I grabbed what appeared to be the newest on clojars, which is [incanter "1.2.3"], while the newest download on that github project page appears to be 1.2.2 from April 20, 2010.

It does sound like the comment that you quoted might indeed be about the bug that I ran into, so maybe it's fixed in some version of Incanter somewhere... But for my current purposes I have more faith in [org.apache.commons/commons-math "2.0"].

Thanks,

-Lee
Re: statistics library?
> Again, if I understand correctly, under no circumstances should the p-value
> ever be outside of the range from 0 to 1. It's a probability, and no value
> outside of that range makes any sense. But Incanter sometimes returns
> p-values greater than 1.

I see that there was a recent fix made to Incanter:

  Fixed typo in :lower-tail? keyword.
  This was causing the complement of the p-value to be returned.

  https://github.com/liebke/incanter/pull/39

Have you tried the latest version in git? Does this fix the problem?

--
Dave
Re: statistics library?
Yes, those errors in Incanter are unfortunate. I had another weird one occur, which David Liebke attributed to the underlying Colt library.

user=> (sd (repeat 9 0.65))
NaN

The sd function calls the variance function, which calls a function in Colt; the trouble is that Colt returns a number very, very close to zero, but just a bit under it (i.e. it's negative)

user=> (variance (repeat 9 0.65))
-1.1102230246251565E-16

and the sqrt of a negative number is NaN.

Lee Spector wrote:
> I need to do some pretty simple statistics in a Clojure program and
> Incanter produces results that I think must be wrong (details
> below). So I don't think I can trust it.
>
> [...]
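The NaN described above comes from taking the square root of a variance that rounding has pushed slightly below zero. A minimal sketch of two ways around it in plain Clojure follows; `sd-from-variance` and `two-pass-variance` are hypothetical helpers written for illustration, not part of Incanter or Colt, and clamping at zero is an assumption about what a caller would want here.

```clojure
(defn sd-from-variance
  "Standard deviation from an already computed variance.
  Clamps tiny negative values (rounding noise) to zero before the sqrt."
  [v]
  (Math/sqrt (max 0.0 v)))

(defn two-pass-variance
  "Sample variance as a sum of squared deviations from the mean.
  Every term is a square, so the result cannot be negative."
  [xs]
  (let [n    (count xs)
        mean (/ (reduce + xs) (double n))
        ss   (reduce + (map #(let [d (- % mean)] (* d d)) xs))]
    (/ ss (dec n))))

;; The variance value quoted in the thread
(def colt-variance -1.1102230246251565E-16)

(Math/sqrt colt-variance)                        ; NaN, as reported
(sd-from-variance colt-variance)                 ; 0.0
(sd-from-variance (two-pass-variance (repeat 9 0.65)))
```

Clamping hides the symptom only; a genuinely negative variance of larger magnitude would still point to a bug upstream, so a stricter version might clamp only within a small tolerance.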
Re: statistics library?
On Sep 27, 2011, at 1:37 PM, Johann Hibschman wrote:
> Johann Hibschman writes:
>
>> There may be an easier way to do this, but this worked for me:
>>
>> user=> (org.apache.commons.math.stat.inference.TestUtils/tTest
>>          (into-array Double/TYPE [40 5 2]) (into-array Double/TYPE [1 5 1]))
>> 0.3884493044983227
>
> I should have used (double-array [40 5 2]) here, but for some reason I
> couldn't remember it until I hit send.

Hooray! This is beautiful.

I had to tinker a bit to find/download the library jars and get my runtime environment to find them, but then this did exactly what I wanted.

Thanks so much for the confirmation and solution!

-Lee
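For readers who would rather not fetch the jars by hand, the same library can probably be pulled in through Leiningen, which is already in use elsewhere in this thread. The fragment below is a sketch only: the project name `stats-demo` and the Clojure version are assumptions, and the commons-math coordinate is the one Lee quotes in an earlier message.

```clojure
;; project.clj (hypothetical project name and version)
(defproject stats-demo "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.3.0"]            ; assumed Clojure version
                 [org.apache.commons/commons-math "2.0"]]) ; coordinate quoted in the thread
```

After "lein deps", the call quoted above should be available from a REPL started in that project, and `(double-array [40 5 2])` can stand in for the `into-array` form.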
Re: statistics library?
Johann Hibschman writes:
> There may be an easier way to do this, but this worked for me:
>
> user=> (org.apache.commons.math.stat.inference.TestUtils/tTest
>          (into-array Double/TYPE [40 5 2]) (into-array Double/TYPE [1 5 1]))
> 0.3884493044983227

I should have used (double-array [40 5 2]) here, but for some reason I couldn't remember it until I hit send.
Re: statistics library?
Lee Spector writes:
> I need to do some pretty simple statistics in a Clojure program and
> Incanter produces results that I think must be wrong (details
> below). So I don't think I can trust it.

I agree, those all look weird to me.

> Is there other code for statistical testing out there?

I'd reach for commons-math, but I don't have much experience.

> If I understand correctly the t-test should produce a p-value which
> ranges from 0 to 1. If it's less than 0.05 we can say that the means
> differ. (Again, there would be more to say here about what's
> statistically meaningful, but that discussion isn't relevant to my
> question).

This is true.

> => (t-test (range 1 11) :mu 0)
> {:conf-int [3.33414941027723 7.66585058972277],
> :x-mean 5.5,
> :t-stat 5.744562646538029,
> :p-value 1.9997218039889517,
> :n1 10,
> :df 9,
> :n2 nil,
> :y-var nil,
> :x-var 9.166,
> :y-mean nil}

This looks wrong to me. At least according to R, the p-value is 0.000278. Interestingly, this is 2 - [incanter's p].

> => (t-test '(40 5 2) :y '(1 5 1))
> {:conf-int [-39.46068349230474 66.12735015897141],
> :x-mean 15.666,
> :t-stat 1.0866516498483223,
> :p-value 1.6115506955016772,
> :n1 3,
> :df 2.0477900396893336,
> :n2 3,
> :y-var 5.332,
> :x-var 446.37,
> :y-mean 2.3335}

R gives 0.3884, which is again 2 - [incanter's p]. Fishy. I would say that there's a bug in Incanter's distribution function, at least when calculating values in the tails.

> If not, then does anyone have a pointer to more reliable statistics
> code in Clojure? Or pointers to using a Java library? I see that there
> are libraries out there --
> e.g.
> http://commons.apache.org/math/api-1.2/org/apache/commons/math/stat/inference/TTest.html
> -- but Java interop is not my strong suit and I'm not sure how to call
> this from my Clojure code.

There may be an easier way to do this, but this worked for me:

user=> (org.apache.commons.math.stat.inference.TestUtils/tTest
         (into-array Double/TYPE [40 5 2]) (into-array Double/TYPE [1 5 1]))
0.3884493044983227

Hope that helps,
Johann
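The "2 - p" pattern noted above is what one would expect if the faulty code doubled the lower-tail probability regardless of the sign of t. That is an assumption inferred from the numbers in this thread, not a reading of Incanter's source, and the function names below are hypothetical. A small sketch of the arithmetic:

```clojure
(defn two-sided-p
  "Two-sided p-value from c = P(T <= t), the lower-tail CDF at the observed t.
  Doubles whichever tail is smaller, so the result stays in [0, 1]."
  [c]
  (* 2.0 (min c (- 1.0 c))))

(defn doubled-lower-tail
  "Hypothetical model of the faulty result: always doubles the lower tail."
  [c]
  (* 2.0 c))

;; Lower-tail probability implied by the 1.6115506955016772 quoted in the thread
(def c (/ 1.6115506955016772 2.0))

(doubled-lower-tail c) ; greater than 1, as in the thread
(two-sided-p c)        ; close to the 0.3884 that R and commons-math report
```

Under this model a negative t statistic gives the same answer from both functions, which would explain why the first example in the original post looked reasonable.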
statistics library?
I need to do some pretty simple statistics in a Clojure program and Incanter produces results that I think must be wrong (details below). So I don't think I can trust it.

Is there other code for statistical testing out there?

Or maybe somebody could explain to me how to interpret the seemingly anomalous Incanter results? (I received no reply on the Incanter list).

I only need a t-test at the moment, but this is a bit of a pain to code from scratch (because of the table that it uses).

I'm trying to use an un-paired, two-tailed t-test to tell whether the means of two sets of numbers differ significantly. (Whether or not this is the right test for my application -- e.g. whether the assumptions of normal distributions are valid -- is another matter. I just want to know if the tests are being calculated correctly.)

If I understand correctly the t-test should produce a p-value which ranges from 0 to 1. If it's less than 0.05 we can say that the means differ. (Again, there would be more to say here about what's statistically meaningful, but that discussion isn't relevant to my question).

Again, if I understand correctly, under no circumstances should the p-value ever be outside of the range from 0 to 1. It's a probability, and no value outside of that range makes any sense. But Incanter sometimes returns p-values greater than 1.

Sometimes it seems to give reasonable results:

=> (use 'incanter.stats)
nil
=> (t-test [2 3 4 3 2 3] :y [3 4 5 6 5 4 3])
{:conf-int [-2.6129722457891322 -0.2917896589727722],
 :x-mean 2.8335,
 :t-stat -2.7883256115163184,
 :p-value 0.018335366451909547,
 :n1 6,
 :df 10.519255193727584,
 :n2 7,
 :y-var 1.2380952380952408,
 :x-var 0.5658,
 :y-mean 4.285714285714286}

But in other cases the :p-value is over 1. Here's an example from Incanter's own documentation:

=> (t-test (range 1 11) :mu 0)
{:conf-int [3.33414941027723 7.66585058972277],
 :x-mean 5.5,
 :t-stat 5.744562646538029,
 :p-value 1.9997218039889517,
 :n1 10,
 :df 9,
 :n2 nil,
 :y-var nil,
 :x-var 9.166,
 :y-mean nil}

Here's an example that's closer to what can arise in my application, and again I just don't see how the calculation can be right if it's producing this kind of p-value:

=> (t-test '(40 5 2) :y '(1 5 1))
{:conf-int [-39.46068349230474 66.12735015897141],
 :x-mean 15.666,
 :t-stat 1.0866516498483223,
 :p-value 1.6115506955016772,
 :n1 3,
 :df 2.0477900396893336,
 :n2 3,
 :y-var 5.332,
 :x-var 446.37,
 :y-mean 2.3335}

Am I missing something that would rationalize these results?

If not, then does anyone have a pointer to more reliable statistics code in Clojure? Or pointers to using a Java library? I see that there are libraries out there -- e.g. http://commons.apache.org/math/api-1.2/org/apache/commons/math/stat/inference/TTest.html -- but Java interop is not my strong suit and I'm not sure how to call this from my Clojure code.

Any pointers would be appreciated.

Thanks,

-Lee
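The part of an unpaired, unequal-variance (Welch) t-test that needs no table is the t statistic and the degrees of freedom; only the final p-value requires the t distribution. The sketch below computes those two quantities in plain Clojure for the last example above. The function names are hypothetical, and the Welch form is an assumption suggested by the fractional :df in the output, not something confirmed from Incanter's source.

```clojure
(defn mean [xs]
  (/ (reduce + xs) (double (count xs))))

(defn sample-variance
  "Unbiased sample variance (divides by n - 1)."
  [xs]
  (let [m (mean xs)]
    (/ (reduce + (map #(let [d (- % m)] (* d d)) xs))
       (dec (count xs)))))

(defn welch-t
  "t statistic and Welch-Satterthwaite degrees of freedom for two samples."
  [xs ys]
  (let [n1 (count xs)
        n2 (count ys)
        a  (/ (sample-variance xs) n1)   ; squared standard error of mean(xs)
        b  (/ (sample-variance ys) n2)   ; squared standard error of mean(ys)
        t  (/ (- (mean xs) (mean ys)) (Math/sqrt (+ a b)))
        df (/ (* (+ a b) (+ a b))
              (+ (/ (* a a) (dec n1)) (/ (* b b) (dec n2))))]
    {:t-stat t :df df}))

(welch-t [40 5 2] [1 5 1])
;; :t-stat is about 1.0867 and :df about 2.0478
```

These agree with the :t-stat and :df in the Incanter output above, which suggests that only the conversion from t to a p-value is in question.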