On May 9, 2023, at 07:55, Haijun Feng <[email protected]> wrote:
> Can anyone help me figure out how to get each atom with H from the smiles as
> above. Thanks so much!
Try using Chem.MolFragmentToSmiles to get the SMILES for each atom, with all
hydrogens explicit, then strip off the leading and trailing []s.

from rdkit import Chem

mol = Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
for i, atom in enumerate(mol.GetAtoms()):
    # With allHsExplicit=True every atom is written in brackets, e.g. "[cH]"
    atom_smi = Chem.MolFragmentToSmiles(mol, allHsExplicit=True,
                                        atomsToUse=[atom.GetIdx()])
    print(i, atom_smi.strip("[]"))

This prints
0 cH
1 cH
2 cH
3 cH
4 cH
5 c
6 C
7 NH2
8 O
Your code showed you using
atom.SetProp('molAtomMapNumber',str(i))
In the following, I'll set that property *after* getting the atom SMILES, so
the map is not included as part of the output:

from rdkit import Chem

mol = Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
for i, atom in enumerate(mol.GetAtoms()):
    atom_smi = Chem.MolFragmentToSmiles(mol, allHsExplicit=True,
                                        atomsToUse=[atom.GetIdx()])
    print(i, atom_smi.strip("[]"))
    # Set the atom map only after the atom SMILES has been generated
    atom.SetIntProp("molAtomMapNumber", i)
print(Chem.MolToSmiles(mol))

which gives the output
0 cH
1 cH
2 cH
3 cH
4 cH
5 c
6 C
7 NH2
8 O
[cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8]
> the output is: [cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1C:6=[O:8]
For what it's worth, I get the slightly different:
[cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8]
Be aware that the atom order in the output SMILES may differ from the atom
order in the input, because MolToSmiles writes a canonical SMILES by default.
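
To see that reordering concretely, here is a minimal sketch that writes the
SMILES without any atom maps and then reads back the order in which the atoms
were written. It relies on the computed property "_smilesAtomOutputOrder",
which RDKit sets on the molecule after MolToSmiles. I am assuming that the
property reads back as a string such as "[7,6,8,...]", so the code only pulls
out the digits instead of depending on the exact punctuation.

```python
import re
from rdkit import Chem

mol = Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
smi = Chem.MolToSmiles(mol)  # no atom maps set, so this is the plain canonical SMILES

# Assumption: the property is readable as a string like "[7,6,8,...]".
# Extracting the digits avoids relying on its exact formatting.
order_text = mol.GetProp("_smilesAtomOutputOrder")
order = [int(s) for s in re.findall(r"\d+", order_text)]

print(smi)
for pos, idx in enumerate(order):
    # pos = position in the output SMILES, idx = original atom index
    atom_smi = Chem.MolFragmentToSmiles(mol, allHsExplicit=True,
                                        atomsToUse=[idx])
    print(pos, idx, atom_smi.strip("[]"))
```

Each printed row pairs a position in the output SMILES with the original atom
index, so you can line the two orders up without using atom maps.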
Because every atom in your preferred output SMILES is written in brackets, you
can alternatively extract the atom terms from the output string by looking for
the substrings inside of the []s, as in the following:

>>> import re
>>> re.compile(r'\[[^]]+\]').findall("[cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8]")
['[cH:0]', '[cH:1]', '[cH:2]', '[cH:3]', '[cH:4]', '[c:5]', '[C:6]', '[NH2:7]',
 '[O:8]']

This list will exactly match the output SMILES atom order.
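
If you want the terms keyed by atom index rather than by output position, a
small extension of the same idea might look like the sketch below. It uses only
the standard library and assumes that every atom in the string is bracketed and
carries a map number equal to its atom index, as in the output above. The names
atom_term and label_by_index are just illustrative.

```python
import re

# Output SMILES from the example above (atom maps were set to the atom indices).
smiles = "[cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8]"

# One bracket atom: capture the text before the colon and the map number after it.
atom_term = re.compile(r"\[([^]:]+):(\d+)\]")

# Map each atom-map number (here, the atom index) to its atom-with-H label.
label_by_index = {int(num): label for label, num in atom_term.findall(smiles)}

for idx in sorted(label_by_index):
    print(idx, label_by_index[idx])
```

Because the lookup is keyed by the map number, it gives the same answer even
when the output SMILES lists the atoms in a different order than the input.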
Cheers,
Andrew