[jira] [Commented] (MATH-814) Kendalls Tau Implementation

2013-10-31 Thread Thomas Neidhart (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13810722#comment-13810722
 ] 

Thomas Neidhart commented on MATH-814:
--

Hi Matt,

added your patch in r1537660 with some modifications:

 * use FastMath instead of Math
 * javadoc formatting and linewidth
 * use same class interface as other correlations
 * use existing Pair class instead of ComparablePair
 * simplify the correlation method to only support double[] atm (may be 
extended if needed)
 * added testcases for longley and swiss fertility data sets based on our 
correlation testsuite for R

Thanks a lot for your contribution!

The other points wrt a commons base class / interface are perfectly valid and I 
would be very much in favor. It should be fairly easy to introduce an abstract 
base class Correlation for the 3 implementations that we have right now.

btw. we prefer abstract base classes over interfaces, as they are more or less 
the same as interfaces, but make it possible to extend without breaking 
compatibility.

We should create a separate issue for this, feel free to work on this already 
and provide a patch if you are interested.

 Kendalls Tau Implementation
 ---

 Key: MATH-814
 URL: https://issues.apache.org/jira/browse/MATH-814
 Project: Commons Math
  Issue Type: New Feature
Affects Versions: 4.0
 Environment: All
Reporter: devl
Assignee: Phil Steitz
  Labels: correlation, rank
 Fix For: 4.0

 Attachments: kendalls-tau.patch

   Original Estimate: 840h
  Remaining Estimate: 840h

 Implement the Kendall's Tau which is a measure of Association/Correlation 
 between ranked ordinal data.
 A basic description is available at 
 http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient however 
 the test implementation will follow that defined by Handbook of Parametric 
 and Nonparametric Statistical Procedures, Fifth Edition, Page 1393 Test 30, 
 ISBN-10: 1439858012 | ISBN-13: 978-1439858011.
 The algorithm is proposed as follows. 
 Given two rankings or permutations represented by a 2D matrix; columns 
 indicate rankings (e.g. by an individual) and row are observations of each 
 rank. The algorithm is to calculate the total number of concordant pairs of 
 ranks (between columns), discordant pairs of ranks  (between columns) and 
 calculate the Tau defined as
 tau= (Number of concordant - number of discordant)/(n(n-1)/2)
  where n(n-1)/2 is the total number of possible pairs of ranks.
 The method will then output the tau value between -1 and 1 where 1 signifies 
 a perfect correlation between the two ranked lists. 
 Where ties exist within a ranking it is marked as neither concordant nor 
 discordant in the calculation. An optional merge sort can be used to speed up 
 the implementation. Details are in the wiki page.
 Although this implementation is not particularly complex it would be useful 
 to have it in a consistent format in the commons math package in addition to 
 existing correlation tests. Kendall's Tau is used effectively in comparing 
 ranks for products, rankings from search engines or measurements from 
 engineering equipment.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MATH-814) Kendalls Tau Implementation

2013-10-29 Thread Matt Adereth (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13808595#comment-13808595
 ] 

Matt Adereth commented on MATH-814:
---

Apologies in advance for not being consistent with SpearmansCorrelation.  I 
couldn't bring myself to make these methods non-static or to use the approach 
of having the constructor take the data.  If this is a problem, I have no issue 
making the requisite change.

Another consideration that I didn't do, but would be willing to try, would be 
to make a new Correlation interface that exposes the methods that should be 
common to Spearmans, Kendalls, and Pearsons.  It would probably make sense to 
then have an AbstractCorrelation base class that handles the sheparding of data 
between Matrix, double[][], and double[], double[].

Finally, if you do think a Correlation interface makes sense, I'd also like to 
propose a NonParametricCorrelation interface for Spearmans and Kendalls which 
would have an additional method for computing the Correlation between two 
ListComparable objects.

 Kendalls Tau Implementation
 ---

 Key: MATH-814
 URL: https://issues.apache.org/jira/browse/MATH-814
 Project: Commons Math
  Issue Type: New Feature
Affects Versions: 4.0
 Environment: All
Reporter: devl
Assignee: Phil Steitz
  Labels: correlation, rank
 Fix For: 4.0

 Attachments: kendalls-tau.patch

   Original Estimate: 840h
  Remaining Estimate: 840h

 Implement the Kendall's Tau which is a measure of Association/Correlation 
 between ranked ordinal data.
 A basic description is available at 
 http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient however 
 the test implementation will follow that defined by Handbook of Parametric 
 and Nonparametric Statistical Procedures, Fifth Edition, Page 1393 Test 30, 
 ISBN-10: 1439858012 | ISBN-13: 978-1439858011.
 The algorithm is proposed as follows. 
 Given two rankings or permutations represented by a 2D matrix; columns 
 indicate rankings (e.g. by an individual) and row are observations of each 
 rank. The algorithm is to calculate the total number of concordant pairs of 
 ranks (between columns), discordant pairs of ranks  (between columns) and 
 calculate the Tau defined as
 tau= (Number of concordant - number of discordant)/(n(n-1)/2)
  where n(n-1)/2 is the total number of possible pairs of ranks.
 The method will then output the tau value between -1 and 1 where 1 signifies 
 a perfect correlation between the two ranked lists. 
 Where ties exist within a ranking it is marked as neither concordant nor 
 discordant in the calculation. An optional merge sort can be used to speed up 
 the implementation. Details are in the wiki page.
 Although this implementation is not particularly complex it would be useful 
 to have it in a consistent format in the commons math package in addition to 
 existing correlation tests. Kendall's Tau is used effectively in comparing 
 ranks for products, rankings from search engines or measurements from 
 engineering equipment.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MATH-814) Kendalls Tau Implementation

2012-07-10 Thread devl (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13410709#comment-13410709
 ] 

devl commented on MATH-814:
---

Initial feedback from Phil Steitz

I think a Kendal's Tau implementation would make a great addition to
the correlation package (o.a.c.math3.stat.correlation).  Here is how
you can get started:

0) Get yourself set up to build commons math and run the unit
tests.  If you are familiar with maven, this should not be too
hard.  If you have any questions or run into problems checking out
the sources, building locally, etc., don't hesitate to ask.
1) Look at the Spearman's implementation and the ranking classes in
the stat.ranking package.  That might give you some ideas on how to
implement Kendal's consistently.
2) Open a JIRA ticket with the info above and start attaching
patches implementing the new implementation class and associated
test class.  Run mvn site or checkstyle standalone to make sure
your contributed code follows the style guidelines we use.
3) Be patient but persistent and we will get Kendall's Tau into
commons math :)

 Kendalls Tau Implementation
 ---

 Key: MATH-814
 URL: https://issues.apache.org/jira/browse/MATH-814
 Project: Commons Math
  Issue Type: New Feature
Affects Versions: 4.0
 Environment: All
Reporter: devl
  Labels: correlation, rank
 Fix For: 4.0

   Original Estimate: 840h
  Remaining Estimate: 840h

 Implement the Kendall's Tau which is a measure of Association/Correlation 
 between ranked ordinal data.
 A basic description is available at 
 http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient however 
 the test implementation will follow that defined by Handbook of Parametric 
 and Nonparametric Statistical Procedures, Fifth Edition, Page 1393 Test 30, 
 ISBN-10: 1439858012 | ISBN-13: 978-1439858011.
 The algorithm is proposed as follows. 
 Given two rankings or permutations represented by a 2D matrix; columns 
 indicate rankings (e.g. by an individual) and row are observations of each 
 rank. The algorithm is to calculate the total number of concordant pairs of 
 ranks (between columns), discordant pairs of ranks  (between columns) and 
 calculate the Tau defined as
 tau= (Number of concordant - number of discordant)/(n(n-1)/2)
  where n(n-1)/2 is the total number of possible pairs of ranks.
 The method will then output the tau value between -1 and 1 where 1 signifies 
 a perfect correlation between the two ranked lists. 
 Where ties exist within a ranking it is marked as neither concordant nor 
 discordant in the calculation. An optional merge sort can be used to speed up 
 the implementation. Details are in the wiki page.
 Although this implementation is not particularly complex it would be useful 
 to have it in a consistent format in the commons math package in addition to 
 existing correlation tests. Kendall's Tau is used effectively in comparing 
 ranks for products, rankings from search engines or measurements from 
 engineering equipment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira