Thank you, Sereina. I understand the importance of adding hydrogens to get reasonable 3D coordinates. But the situation may not be that simple.
1. Adding hydrogens is only required when the template coordinates are supplied from an external file. If the template coordinates are generated by RDKit embedding, it works without adding explicit hydrogens.
2. I found an opposite example where adding hydrogens breaks constrained embedding when custom template coordinates are used. And again, if I generate the template coordinates with RDKit, everything is OK without adding Hs.
These observations suggest that there is some issue with using custom coordinates for constrained embedding.
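One side check that might help narrow this down, as a rough sketch only: Chem.MolFromMolFile removes explicit hydrogens on read by default (removeHs=True), so it may be worth looking at what the raw file contains before RDKit touches it. The helper below (summarize_v2000_molblock is a hypothetical name, not an RDKit function) reads the V2000 counts line and atom block with plain Python and reports the atom count, the hydrogen count, and whether any z coordinate is non-zero. It assumes a V2000 block; the example block is a hand-written stand-in, not the attached 1.mol.

```python
from typing import NamedTuple


class MolBlockSummary(NamedTuple):
    n_atoms: int
    n_hydrogens: int
    has_3d_coords: bool


def summarize_v2000_molblock(text: str) -> MolBlockSummary:
    """Summarize the atom block of a V2000 MOL block (V3000 is not handled)."""
    lines = text.splitlines()
    if len(lines) < 4 or "V2000" not in lines[3]:
        raise ValueError("only V2000 MOL blocks are supported in this sketch")
    # Counts line: the first fixed-width field (3 characters) is the atom count.
    n_atoms = int(lines[3][0:3])
    atom_lines = lines[4:4 + n_atoms]
    if len(atom_lines) != n_atoms:
        raise ValueError("atom block is shorter than the counts line says")

    n_hydrogens = 0
    has_3d = False
    for line in atom_lines:
        # Atom line: x, y, z in three 10-character fields, then a space,
        # then the element symbol in a 3-character field.
        z = float(line[20:30])
        symbol = line[31:34].strip()
        if symbol == "H":
            n_hydrogens += 1
        if abs(z) > 1e-4:
            has_3d = True
    return MolBlockSummary(n_atoms, n_hydrogens, has_3d)


# Hand-written stand-in block (a water molecule), only for illustration.
EXAMPLE_MOLBLOCK = "\n".join([
    "example",
    "  hand-written",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.9572    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.2400    0.9266    0.1000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  1  3  1  0",
    "M  END",
])

if __name__ == "__main__":
    print(summarize_v2000_molblock(EXAMPLE_MOLBLOCK))
```

For a real file, the text could be passed in with open('1.mol').read(). If the file turns out to contain hydrogens, reading it with Chem.MolFromMolFile(fname, removeHs=False) keeps them on the template, which may change how the core is matched.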
I provided the code and output below.
Code:
from rdkit import Chem
from rdkit.Chem import AllChem

# (template mol file, template SMILES, child SMILES)
data = [('1.mol', 'C[C@@H]1CCCCC1=O', 'C[C@@H]1CC[C@H](O)CC1=O'),
        ('2.mol', 'CCCCCCCC[C@@H](CCC)NC(=O)c1ccc(F)cc1',
         'CCCC[C@H](CCC[C@@H](CCC)NC(=O)c1ccc(F)cc1)NC(=O)c1ccco1')]

for i, (mol_fname, smi_template, smi_child) in enumerate(data):
    print('iteration', i)

    # Template coordinates from the mol file, child without explicit Hs
    mode = 'read template mol file, no AddHs'
    print(mode)
    mol_template = Chem.MolFromMolFile(mol_fname)
    mol_child = Chem.MolFromSmiles(smi_child)
    try:
        mol = AllChem.ConstrainedEmbed(mol_child, mol_template)
        print(mol.GetProp('EmbedRMS'))
    except ValueError as e:
        print(e)

    # Template coordinates from the mol file, child with explicit Hs
    mode = 'read template mol file, AddHs'
    print(mode)
    mol_template = Chem.MolFromMolFile(mol_fname)
    mol_child = Chem.MolFromSmiles(smi_child)
    try:
        mol = AllChem.ConstrainedEmbed(Chem.AddHs(mol_child), mol_template)
        print(mol.GetProp('EmbedRMS'))
    except ValueError as e:
        print(e)

    # Template coordinates generated by RDKit, child without explicit Hs
    mode = 'embed template mol in rdkit, no AddHs'
    print(mode)
    mol_template = Chem.MolFromSmiles(smi_template)
    AllChem.EmbedMolecule(mol_template)
    mol_child = Chem.MolFromSmiles(smi_child)
    try:
        mol = AllChem.ConstrainedEmbed(mol_child, mol_template)
        print(mol.GetProp('EmbedRMS'))
    except ValueError as e:
        print(e)

Output:
iteration 0
read template mol file, no AddHs
Could not embed molecule.
read template mol file, AddHs
0.05014807519735495
embed template mol in rdkit, no AddHs
0.12358989886023371
iteration 1
read template mol file, no AddHs
0.057937898735270194
read template mol file, AddHs
Could not embed molecule. # <-- here rdkit spends a lot of time but fails
embed template mol in rdkit, no AddHs
0.1012757033705761

Pavel.

On 07/07/2020 21:41, Sunhwan Jo wrote:
Makes sense :)

On Jul 7, 2020, at 12:35 PM, Sereina Riniker <[email protected]> wrote:

Dear Pavel and Sunhwan,

Please note that hydrogens should always be added for the embedding algorithm to work properly (i.e. it's not a workaround but what should be done). See also the section "Working with 3D Molecules" in https://www.rdkit.org/docs/GettingStartedInPython.html

Best regards,
Sereina

On 7 Jul 2020, at 21:26, Sunhwan Jo <[email protected]> wrote:

The reason the constrained embed didn't work is that the molecule simply can't be embedded using RDKit's algorithm.

In [25]: mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
In [26]: AllChem.EmbedMolecule(mol_child)
Out[26]: -1

See more discussion here: https://github.com/rdkit/rdkit/issues/2996

The SMILES you posted looks valid to me and doesn't look that complicated, but anyway I think RDKit's algorithm somehow tripped up and couldn't finish embedding without some help. Hope someone with more in-depth insight can help here. Anyway, as a workaround, adding Hs seems to do the trick:

In [39]: mol = AllChem.AddHs(mol_child)
In [40]: AllChem.EmbedMolecule(mol)
Out[40]: 0 # worked
In [41]: AllChem.ConstrainedEmbed(mol, mol_parent)
Out[41]: <rdkit.Chem.rdchem.Mol at 0x7fe8000f6f80> # also worked

Sunhwan

On Jul 7, 2020, at 12:36 AM, Pavel Polishchuk <[email protected]> wrote:

Hi all,

I have an issue with ConstrainedEmbed and I cannot figure out what exactly causes it. I have a molecule C[C@@H]1CCCCC1=O with 3D coordinates in the 1.mol file (attached). I want to generate coordinates for another structure with this core: C[C@@H]1CC[C@H](O)CC1=O.

This is the usual way, which causes an embedding issue and the corresponding error.

mol_parent = Chem.MolFromMolFile('1.mol')
mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
try:
    mol = AllChem.ConstrainedEmbed(mol_child, mol_parent)
except ValueError as e:
    print(e)

If I add explicit hydrogens, the issue disappears.

mol_parent = Chem.MolFromMolFile('1.mol')
mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
mol = AllChem.ConstrainedEmbed(Chem.AddHs(mol_child), mol_parent)

If I do not use pre-defined coordinates, everything works well.

mol_parent = Chem.MolFromSmiles('C[C@@H]1CCCCC1=O')
AllChem.EmbedMolecule(mol_parent)
mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
mol = AllChem.ConstrainedEmbed(mol_child, mol_parent)

Do the ugly coordinates in the 1.mol file cause the embedding issue? Or is the issue caused by some implicit properties of the molecule? How can this be solved properly?

Kind regards,
Pavel.

<1.mol>

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
1.mol
Description: MOL mdl chemical test
2.mol
Description: MOL mdl chemical test

