I didn't want to file a suggestion for a javadoc patch without first hearing
from someone who knows more of the math history behind it, because I didn't
want to suggest something that might be wrong. When I checked the Wikipedia
article, it noted that there is confusion and inconsistency between papers
as to what Tanimoto actually is and how it compares to Jaccard. So I went to
the primary source for Jaccard, and I will get the primary source for
Tanimoto if and when the interlibrary loan comes through.
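
For concreteness, the two formulas from the quoted message below can be
sketched in Java roughly as follows. This is only an illustration of the set
arithmetic, not Mahout's code: the class and method names are made up, and
returning 0.0 when every set is empty is an assumed convention. Whether the
multi-set form matches Tanimoto's original definition is exactly the open
question.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, named only for this illustration.
final class SetSimilaritySketch {

    // Two-set form: items in both / items in at least one of the two.
    static <T> double jaccard(Set<T> a, Set<T> b) {
        Set<T> both = new HashSet<>(a);
        both.retainAll(b);              // intersection of a and b
        Set<T> either = new HashSet<>(a);
        either.addAll(b);               // union of a and b
        // Assumption: similarity of two empty sets is reported as 0.0.
        return either.isEmpty() ? 0.0 : (double) both.size() / either.size();
    }

    // Generalized form from the quoted message:
    // items in all sets / items in at least one set.
    static <T> double multiSet(List<Set<T>> sets) {
        if (sets.isEmpty()) {
            throw new IllegalArgumentException("need at least one set");
        }
        Set<T> inAll = new HashSet<>(sets.get(0));
        Set<T> inAny = new HashSet<>();
        for (Set<T> s : sets) {
            inAll.retainAll(s);         // keep only items present in every set so far
            inAny.addAll(s);            // collect every item seen in any set
        }
        // Assumption: same empty-input convention as jaccard above.
        return inAny.isEmpty() ? 0.0 : (double) inAll.size() / inAny.size();
    }
}
```

With exactly two sets, multiSet returns the same value as jaccard, which is
the point made in the reply quoted below: for pairwise comparison the two
forms coincide.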


On Mon, Apr 8, 2013 at 12:04 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> I don't see the problem here.  We only want to compare two items so
> Jaccard and Tanimoto are identical.
>
> Could you file a JIRA and suggest a javadoc patch?
>
> Why did this take you to an ancient journal instead of Wikipedia?
>
>
> On Apr 7, 2013, at 6:54 AM, James Endicott wrote:
>
> > As far as I can tell, the difference between the two is that the Jaccard
> > similarity can only be used to compare two items using the formula:
> > items appearing in both documents/(items just appearing in one + items
> > just appearing in the other + items appearing in both)
> > But the Tanimoto similarity measure allows for comparing between any
> > number of items by generalizing the formula to:
> > items appearing in all documents/(items just appearing in one + items
> > just appearing in another + ... + items appearing in some but not all +
> > ... + items appearing in all)
> >
> > I think the class could be generalized to implement the full Tanimoto
> > similarity without too much difficulty (though I don't think it's a high
> > priority) but at the moment it does not do so. While I realize this is
> > probably a trivial matter, I hope the docs get updated at some point so
> > another grad student doesn't have to muddle through a botany article in a
> > Swiss journal from 1901 again.
>
>
