Hello,
I have run into a problem using RDKit to generate conformers of
molecules. I am using the following code:

from timeit import default_timer as timer

from rdkit import Chem
from rdkit.Chem import AllChem


def GenerateDGConfs(m, num_confs, rms):
    # Embed up to num_confs conformers, pruning any within rms of one already kept
    start_time = timer()
    ids = AllChem.EmbedMultipleConfs(m, numConfs=num_confs, pruneRmsThresh=rms,
                                     maxAttempts=200, enforceChirality=True)
    # Minimise each embedded conformer with MMFF
    for conf_id in ids:
        AllChem.MMFFOptimizeMolecule(m, confId=conf_id)
    end_time = timer()
    time_diff = end_time - start_time
    # print("Normal DG = %0.2f" % time_diff)
    return m, list(ids), time_diff


# rootdir is defined elsewhere and points at the directory holding the SD files
w = Chem.SDWriter("%s/%s" % (rootdir, "My_conformers.sdf"))
suppl = Chem.SDMolSupplier("%s/%s" % (rootdir, "My_molecules.sdf"))
num_confs = 200
rmsd = 0.5
for mol in suppl:
    if mol is None:
        continue
    Chem.AssignAtomChiralTagsFromStructure(mol)
    mol1 = Chem.AddHs(mol)
    conf_mol, id_list, time_diff = GenerateDGConfs(mol1, num_confs, rmsd)
    num_confs = conf_mol.GetNumConformers()
    # Write every conformer of this molecule to the output file
    for conf_id in id_list:
        w.write(conf_mol, confId=conf_id)
    w.flush()
w.close()

What I see is that, as I work through the molecules in the input file, the
number of conformers returned declines monotonically, from close to the 200
I set as a maximum down to around 10 after a few thousand molecules have been
processed (this applies whether I use 'normal' DG or the ETKDG method). As I am
a new user of RDKit I am sure I have missed something obvious, but I cannot see it.
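
One thing that may be worth checking in the loop above: the line
num_confs = conf_mol.GetNumConformers() overwrites the variable that is passed
as the maximum for the next molecule, so each request can be no larger than the
count returned for the previous one. The sketch below illustrates that pattern
without RDKit. fake_embed is a hypothetical stand-in for conformer generation
(not an RDKit function), and its "up to 5 lost to pruning" behaviour is an
assumption made purely for illustration.

```python
import random


def fake_embed(requested, rng):
    """Hypothetical stand-in for embedding plus RMS pruning.

    Returns a conformer count that is never larger than the number requested.
    """
    return rng.randint(max(1, requested - 5), requested)


def run_reusing_variable(n_mols, max_confs, seed=0):
    """Mimic the loop above: the request size is overwritten after each molecule."""
    rng = random.Random(seed)
    num_confs = max_confs
    requested = []
    for _ in range(n_mols):
        requested.append(num_confs)
        # The returned count becomes the next molecule's maximum
        num_confs = fake_embed(num_confs, rng)
    return requested


def run_separate_variable(n_mols, max_confs, seed=0):
    """Same loop, but the returned count is stored under a different name."""
    rng = random.Random(seed)
    requested = []
    for _ in range(n_mols):
        requested.append(max_confs)
        n_found = fake_embed(max_confs, rng)  # does not touch max_confs
    return requested
```

In the first function the requested count can only stay level or fall from one
molecule to the next, which would match the decline described above. In the
second it stays at the configured maximum for every molecule.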
Also, once I have generated the conformers, what is the best way to cluster
them by RMSD so that every conformer kept is at least some minimum RMSD away
from all the others in the set?
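
To make the kind of selection meant here concrete, below is a minimal sketch of
a greedy filter over a pairwise RMSD matrix. It is only one possible approach.
It assumes the matrix is supplied as a flattened lower triangle in the order
(1,0), (2,0), (2,1), (3,0), ..., which I believe is the layout returned by
AllChem.GetConformerRMSMatrix. The helper names tri_index and select_diverse
are hypothetical, not RDKit functions.

```python
def tri_index(i, j):
    """Position of pair (i, j), i != j, in a flattened lower-triangle list."""
    if i < j:
        i, j = j, i
    return i * (i - 1) // 2 + j


def select_diverse(n_items, rms_tri, threshold):
    """Greedily keep items that are at least `threshold` from every kept item.

    n_items   -- number of conformers
    rms_tri   -- flattened lower-triangle RMSD values (assumed layout, see above)
    threshold -- minimum RMSD required between any two kept conformers
    """
    kept = []
    for i in range(n_items):
        # Keep conformer i only if it is far enough from all those kept so far
        if all(rms_tri[tri_index(i, k)] >= threshold for k in kept):
            kept.append(i)
    return kept


# Four conformers; pairs (1,0) and (3,2) are closer than 0.5
example = [0.2,
           1.0, 0.9,
           0.8, 0.7, 0.1]
print(select_diverse(4, example, 0.5))
```

The result depends on the order in which conformers are visited, so sorting
them by energy first would favour the lowest-energy member of each group.
Butina clustering (rdkit.ML.Cluster.Butina) on the same matrix is an
alternative that returns explicit clusters rather than a filtered list.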
Any help would be gratefully received.
Paul.
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss