Re: [Scikit-learn-general] Jaccard Index

2016-05-11 Thread Shishir Pandey
Thanks for your reply. I get it now. The all-zeros case implies that the two sets are empty, which is a 0/0 situation. Hence, it is taken to be 1.
-- sp
On Mon, May 9, 2016 at 10:11 PM, Maniteja Nandana <maniteja.modesty...@gmail.com> wrote:
> On 9 May 2016 9:47 pm, "Shishir Pandey" wrote:
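The 0/0 convention mentioned in this message can be sketched in plain Python. The helper name `jaccard_sets` is hypothetical and is not a scikit-learn function; it only illustrates treating two empty label sets as a perfect match.

```python
def jaccard_sets(a, b):
    """Jaccard similarity |a & b| / |a | b| of two label sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets are empty, so the ratio is 0/0.
        # By the convention discussed in the thread it is taken to be 1.
        return 1.0
    return len(a & b) / len(union)


print(jaccard_sets([], []))          # the all-zeros (empty sets) case
print(jaccard_sets([1, 2], [2, 3]))  # one shared label out of three distinct labels
```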

Re: [Scikit-learn-general] Jaccard Index

2016-05-09 Thread Maniteja Nandana
On 9 May 2016 9:47 pm, "Shishir Pandey" wrote:
> From what you are saying, isn't the Jaccard distance for the multi-class case equivalent to (1 - Hamming loss), where the Hamming loss is the fraction of positions at which the two vectors differ?

Yeah, from what I can understand, you are right.
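A minimal sketch of that relationship in plain Python, assuming the multiclass score is the fraction of matching positions as described elsewhere in this thread. The function names and the example vectors are hypothetical, chosen so that only the first and last of four outputs match. Strictly speaking, it is the similarity score that equals 1 - Hamming loss; the corresponding distance would equal the Hamming loss itself.

```python
def match_fraction(y_true, y_pred):
    """Fraction of positions where the two label vectors agree."""
    if len(y_true) != len(y_pred):
        raise ValueError("label vectors must have the same length")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def hamming_loss_fraction(y_true, y_pred):
    """Fraction of positions where the two label vectors differ."""
    if len(y_true) != len(y_pred):
        raise ValueError("label vectors must have the same length")
    mismatches = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return mismatches / len(y_true)


# Hypothetical multiclass labels: positions 0 and 3 match, 1 and 2 do not.
y_true = [0, 1, 2, 3]
y_pred = [0, 2, 1, 3]

score = match_fraction(y_true, y_pred)
loss = hamming_loss_fraction(y_true, y_pred)
print(score, loss, score == 1 - loss)
```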

Re: [Scikit-learn-general] Jaccard Index

2016-05-09 Thread Shishir Pandey
From what you are saying, isn't the Jaccard distance for the multi-class case equivalent to (1 - Hamming loss), where the Hamming loss is the fraction of positions at which the two vectors differ? I want to understand what your examples represent. Could you give an example where the dimension

Re: [Scikit-learn-general] Jaccard Index

2016-05-09 Thread Maniteja Nandana
On 9 May 2016 5:24 pm, "Shishir Pandey" wrote:
> This is what I am having trouble understanding. What does each dimension of the vector represent? I am thinking of it as follows:
>
> [label_1, label_2, ..., label_N]
>
> a characteristic vector would be something like [1, 1, 0, ..., 1, 0, 0]

Re: [Scikit-learn-general] Jaccard Index

2016-05-09 Thread Alan Isaac
On 5/9/2016 7:53 AM, Shishir Pandey wrote:
> A 0 [in both of] the two sets would represent that the label is not present in either of the sets and hence the union would be smaller than the dimension of the vector.

Yes, I agree; that would constitute a standard definition.

Alan Isaac -
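The set-based reading agreed on here can be sketched for characteristic (0/1) vectors. The helper `jaccard_indicator` and the example vectors are hypothetical, not taken from scikit-learn or from the thread; the point is only that a position that is 0 in both vectors contributes to neither the intersection nor the union.

```python
def jaccard_indicator(u, v):
    """Jaccard similarity of two 0/1 characteristic vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("characteristic vectors must have the same length")
    # A label is in the intersection if it is present in both sets.
    intersection = sum(1 for a, b in zip(u, v) if a and b)
    # A label is in the union if it is present in at least one set.
    union = sum(1 for a, b in zip(u, v) if a or b)
    if union == 0:
        # Both sets empty: 0/0, taken to be 1 by convention.
        return 1.0
    return intersection / union


# Six possible labels; positions 2 and 5 are absent from both sets,
# so the union has 4 elements rather than 6.
u = [1, 1, 0, 1, 0, 0]
v = [1, 0, 0, 1, 1, 0]
print(jaccard_indicator(u, v))
```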

Re: [Scikit-learn-general] Jaccard Index

2016-05-09 Thread Shishir Pandey
This is what I am having trouble understanding. What does each dimension of the vector represent? I am thinking of it as follows: [label_1, label_2, ..., label_N] a characteristic vector would be something like [1, 1, 0, ..., 1, 0, 0] Does this represent whether label_i is present in the set or not?

Re: [Scikit-learn-general] Jaccard Index

2016-05-09 Thread Bharat Didwania 4-Yr B.Tech. Electrical Engg.
Hi, the Jaccard similarity coefficient (or score) is the ratio of the size of the intersection to the size of the union of the two label sets. In this case the size of the union is 4 and that of the intersection is 2. Hence the Jaccard similarity score will be 2/4 = 0.5. I hope this will help. Regards, Bharat. On Mon, M

Re: [Scikit-learn-general] Jaccard Index

2016-05-09 Thread Maniteja Nandana
Hi, if I understand it correctly, the Jaccard similarity is the ratio of the number of matching outputs to the total number of outputs in the case of binary and multiclass classification. Here, the first and the last of the four outputs match, hence the Jaccard score is 2/4 = 0.5. I hope

[Scikit-learn-general] Jaccard Index

2016-05-09 Thread Shishir Pandey
I am a bit confused regarding the Jaccard similarity score. The example given at: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.jaccard_similarity_score.html#sklearn.metrics.jaccard_similarity_score
>>> import numpy as np
>>> from sklearn.metrics import jaccard_similarity_sco