[ 
https://issues.apache.org/jira/browse/MATH-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810722#comment-13810722
 ] 

Thomas Neidhart commented on MATH-814:
--------------------------------------

Hi Matt,

I added your patch in r1537660 with some modifications:

 * use FastMath instead of Math
 * Javadoc formatting and line width
 * use same class interface as other correlations
 * use existing Pair class instead of ComparablePair
 * simplify the correlation method to only support double[] for now (may be 
extended if needed)
 * add test cases for the longley and swiss fertility data sets, based on our 
correlation test suite for R

Thanks a lot for your contribution!

The other points regarding a common base class / interface are perfectly valid, 
and I would be very much in favor. It should be fairly easy to introduce an 
abstract base class Correlation for the three implementations that we have 
right now.

By the way, we prefer abstract base classes over interfaces: they serve much 
the same purpose, but new methods can be added to them later without breaking 
compatibility for existing implementations.

We should create a separate issue for this. Feel free to start working on it 
and provide a patch if you are interested.
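
To illustrate the compatibility argument, here is a minimal sketch, not the 
actual Commons Math API. The names CorrelationSketch, PearsonSketch and 
correlationMatrix are hypothetical and exist only for this example. The idea 
is that a concrete method added to the abstract base class in a later release 
is simply inherited by subclasses written against the earlier version.

```java
// Hypothetical sketch: these names are illustrative, not Commons Math classes.
abstract class CorrelationSketch {

    /** Each implementation supplies its own coefficient for two samples. */
    abstract double correlation(double[] x, double[] y);

    /**
     * Imagine this method being added in a later release. Because it has a
     * body, subclasses written earlier inherit it and keep compiling.
     */
    double[][] correlationMatrix(double[][] columns) {
        int n = columns.length;
        double[][] result = new double[n][n];
        for (int i = 0; i < n; i++) {
            result[i][i] = 1.0;
            for (int j = 0; j < i; j++) {
                double c = correlation(columns[i], columns[j]);
                result[i][j] = c;
                result[j][i] = c;
            }
        }
        return result;
    }
}

/** Toy subclass (plain Pearson coefficient) used only to exercise the sketch. */
final class PearsonSketch extends CorrelationSketch {
    @Override
    double correlation(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0.0;
        double sxx = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }
}
```

Had CorrelationSketch been an interface (on the Java versions in use at the 
time), adding correlationMatrix would have forced every existing implementation 
to be updated.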

> Kendalls Tau Implementation
> ---------------------------
>
>                 Key: MATH-814
>                 URL: https://issues.apache.org/jira/browse/MATH-814
>             Project: Commons Math
>          Issue Type: New Feature
>    Affects Versions: 4.0
>         Environment: All
>            Reporter: devl
>            Assignee: Phil Steitz
>              Labels: correlation, rank
>             Fix For: 4.0
>
>         Attachments: kendalls-tau.patch
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> Implement Kendall's Tau, a measure of association/correlation between 
> ranked ordinal data.
> A basic description is available at 
> http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient; however, 
> the test implementation will follow the one defined in "Handbook of Parametric 
> and Nonparametric Statistical Procedures, Fifth Edition, Page 1393 Test 30, 
> ISBN-10: 1439858012 | ISBN-13: 978-1439858011."
> The algorithm is proposed as follows. 
> Two rankings or permutations are represented by a 2D matrix: columns 
> indicate rankings (e.g. by an individual) and rows are observations of each 
> rank. The algorithm counts the concordant pairs of ranks (between columns) 
> and the discordant pairs of ranks (between columns), and then calculates 
> the Tau defined as
> tau = (number of concordant pairs - number of discordant pairs) / (n(n-1)/2)
> where n(n-1)/2 is the total number of possible pairs of ranks.
> The method will then output the tau value between -1 and 1, where 1 signifies 
> a "perfect" correlation between the two ranked lists. 
> Where a pair is tied within a ranking, it is counted as neither concordant 
> nor discordant in the calculation. An optional merge sort can be used to 
> speed up the implementation. Details are on the wiki page.
> Although this implementation is not particularly complex, it would be useful 
> to have it in a consistent format in the commons math package, in addition to 
> the existing correlation tests. Kendall's Tau is used effectively in comparing 
> ranks for products, rankings from search engines or measurements from 
> engineering equipment.
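
As a rough illustration of the formula in the description above, the 
following is a minimal sketch of the plain quadratic pair count, under the 
assumption that tied pairs are counted as neither concordant nor discordant 
and that the denominator stays n(n-1)/2. The class name KendallTauSketch is 
hypothetical; this is not the code committed in r1537660 and it does not use 
the merge sort speed-up.

```java
// Hypothetical sketch of the formula in the issue description, O(n^2) pairs.
final class KendallTauSketch {

    private KendallTauSketch() {
        // static helper only
    }

    /**
     * tau = (concordant - discordant) / (n(n-1)/2), where pairs tied in
     * either sample contribute to neither count.
     */
    static double tau(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("samples must have equal length");
        }
        int n = x.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        long concordant = 0;
        long discordant = 0;
        for (int i = 0; i < n - 1; i++) {
            for (int j = i + 1; j < n; j++) {
                // Product of signs: +1 same ordering, -1 opposite, 0 for a tie.
                double s = Math.signum(x[i] - x[j]) * Math.signum(y[i] - y[j]);
                if (s > 0) {
                    concordant++;
                } else if (s < 0) {
                    discordant++;
                }
            }
        }
        double totalPairs = (double) n * (n - 1) / 2.0;
        return (concordant - discordant) / totalPairs;
    }
}
```

For example, x = {1, 2, 3} against y = {1, 3, 2} has two concordant pairs and 
one discordant pair out of three, so this sketch gives tau = 1/3.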



--
This message was sent by Atlassian JIRA
(v6.1#6144)
