Dear Eric,

Sure, if fingerprints are not stable across RDKit releases, people who check
things very carefully (as you did) will get some surprises.
That said, if you want a hash for each molecule, you should probably be using
InChIKeys.
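
A minimal sketch of what that could look like, assuming an RDKit build with
InChI support; the helper name inchikey_for_smiles is made up for illustration
and is not part of RDKit:

```python
from rdkit import Chem


def inchikey_for_smiles(smiles):
    """Return the InChIKey for a SMILES string (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # MolFromSmiles returns None when the SMILES cannot be parsed
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    # MolToInchiKey requires RDKit to have been built with InChI support
    return Chem.MolToInchiKey(mol)


# Two different ways of writing the molecule from the original post
key_a = inchikey_for_smiles('Oc1cc(O)c(O)c(O)c1')
key_b = inchikey_for_smiles('Oc1c(O)cc(O)cc1O')
print(key_a)
print(key_a == key_b)
```

The key is derived from the structure rather than from a serialized
fingerprint, so it does not depend on how the SMILES happens to be written.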

Regards,
F.

On 13/01/2023 06:37, Eric Jonas wrote:
Hello! I use the CRC of Morgan fingerprints as a quick-and-dirty way
to keep track of different molecules, but now I realize it might have
been too quick and dirty! In particular, there appears to have been a
change in the Morgan code sometime between 2021.09.2 and 2022.03.5.
The following code produces different output under these versions:

import zlib

from rdkit import Chem
from rdkit.Chem import rdMolDescriptors


def get_morgan4_crc32(m):
    # CRC32 of the serialized hashed Morgan fingerprint (radius 4)
    mf = rdMolDescriptors.GetHashedMorganFingerprint(m, 4)
    return zlib.crc32(mf.ToBinary())


mol = Chem.AddHs(Chem.MolFromSmiles('Oc1cc(O)c(O)c(O)c1'))
print(get_morgan4_crc32(mol))

2021.09.2 : 1567135676

2022.03.5 : 204854560

I tried looking at the release notes but I didn't seem to see any
breaking changes (I might have missed them!) and I tried looking at
"blame" for the relevant source but didn't see any
seemingly-substantive changes within the relevant timeframe.

So am I doing something crazy here, or did something change
deliberately, or is it possible this is a bug?

...E
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

