When using GetMorganFingerprintAsBitVect I get the “expected” Tanimoto score:
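import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

mol1 = Chem.MolFromSmiles('CCC')
mol2 = Chem.MolFromSmiles('CNC')
# Hashed Morgan fingerprints (radius 2) as 1024-bit vectors
fp1 = AllChem.GetMorganFingerprintAsBitVect(mol1,2,nBits=1024)
fp2 = AllChem.GetMorganFingerprintAsBitVect(mol2,2,nBits=1024)
print(DataStructs.TanimotoSimilarity(fp1, fp2))
# ConvertToNumpyArray resizes the target array to the fingerprint length
arr1 = np.zeros((1,))
DataStructs.ConvertToNumpyArray(fp1, arr1)
arr2 = np.zeros((1,))
DataStructs.ConvertToNumpyArray(fp2, arr2)
# Tanimoto by hand: |A and B| / |A or B|
print(np.sum(arr1*arr2)/np.sum(arr1+arr2-arr1*arr2))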
0.14285714285714285
0.14285714285714285
However, when using GetMorganFingerprint I get a different score:
fp1 = AllChem.GetMorganFingerprint(mol1,2)
fp2 = AllChem.GetMorganFingerprint(mol2,2)
print(DataStructs.TanimotoSimilarity(fp1, fp2))
0.2
I thought the Tanimoto score was always computed using bit vectors. Can anyone
explain?
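
One thing that might be worth checking is whether feature counts, rather than
only presence or absence, enter the score. As far as I understand,
GetMorganFingerprint returns a sparse count vector rather than a bit vector,
and Tanimoto on counts is usually generalised to sum(min) / sum(max). The
sketch below is plain Python and does not call RDKit; the feature IDs and
counts are hypothetical, picked only to illustrate how the two definitions can
diverge for the same pair of fingerprints.

```python
def tanimoto_binary(a, b):
    """Tanimoto on presence/absence: |A and B| / |A or B|."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(keys_a & keys_b) / len(union)


def tanimoto_counts(a, b):
    """Count-based generalisation: sum of minima / sum of maxima."""
    keys = set(a) | set(b)
    numerator = sum(min(a.get(k, 0), b.get(k, 0)) for k in keys)
    denominator = sum(max(a.get(k, 0), b.get(k, 0)) for k in keys)
    if denominator == 0:
        return 0.0  # same convention as above
    return numerator / denominator


# Hypothetical feature IDs and counts (NOT values read from RDKit):
# each fingerprint has four distinct features, one of which is shared.
fp_a = {"f1": 2, "f2": 1, "f3": 2, "f4": 1}
fp_b = {"f1": 2, "f5": 1, "f6": 2, "f7": 1}

print(tanimoto_binary(fp_a, fp_b))  # 1 shared / 7 distinct features
print(tanimoto_counts(fp_a, fp_b))  # 2 / 10
```

With these illustrative counts the binary version gives 1/7 and the count
version gives 0.2. On the RDKit side, the actual counts can be inspected with
fp1.GetNonzeroElements() on the object returned by GetMorganFingerprint.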
Best regards, Jan