On May 9, 2023, at 07:55, Haijun Feng <haijun20230...@gmail.com> wrote:
> Can anyone help me figure out how to get each atom with H from the smiles as 
> above. Thanks so much!

Try using Chem.MolFragmentToSmiles to get the SMILES for each atom, with all 
hydrogens explicit, then strip off the leading and trailing []s.

from rdkit import Chem

mol = Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
for i, atom in enumerate(mol.GetAtoms()):
  # Write a one-atom SMILES fragment; allHsExplicit=True puts the atom in []s
  atom_smi = Chem.MolFragmentToSmiles(mol, allHsExplicit=True,
                                      atomsToUse=[atom.GetIdx()])
  print(i, atom_smi.strip("[]"))

This prints

0 cH
1 cH
2 cH
3 cH
4 cH
5 c
6 C
7 NH2
8 O

Your code showed you using

   atom.SetProp('molAtomMapNumber',str(i))

In the following, I'll set that property *after* getting the atom SMILES, so 
the map is not included as part of the output:

from rdkit import Chem

mol = Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
for i, atom in enumerate(mol.GetAtoms()):
  atom_smi = Chem.MolFragmentToSmiles(mol, allHsExplicit=True,
                                      atomsToUse=[atom.GetIdx()])
  print(i, atom_smi.strip("[]"))
  # Set the atom map only after the atom SMILES has been generated
  atom.SetIntProp("molAtomMapNumber", i)

print(Chem.MolToSmiles(mol))

which gives the output

0 cH
1 cH
2 cH
3 cH
4 cH
5 c
6 C
7 NH2
8 O
[cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8]



> the output is: [cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1C:6=[O:8]

For what it's worth, I get slightly different output:

   [cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8] 

You should be aware that the atom order in the output SMILES might differ from 
the input atom order, because MolToSmiles generates a canonical SMILES by 
default.

Because of the simpler structure of your preferred output SMILES format, you 
can alternatively extract the atom terms from the output string by looking for 
the substrings inside of the []s, as in the following:

>>> import re
>>> re.compile(r'\[[^]]+\]').findall("[cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8]")
['[cH:0]', '[cH:1]', '[cH:2]', '[cH:3]', '[cH:4]', '[c:5]', '[C:6]', '[NH2:7]', '[O:8]']

This list will exactly match the output SMILES atom order.
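
If you need the terms back in input order rather than output order, the atom 
map numbers inside each term can serve as the key. Here is a minimal sketch 
using only the standard library. The names ATOM_TERM and atoms_by_map_number 
are mine, for illustration only, and the pattern assumes every atom of 
interest is written as a bracket atom with a ":N" map; unmapped atoms are 
skipped, and a repeated map number would overwrite the earlier entry.

```python
import re

# Matches one bracketed atom term that carries an atom map, e.g. "[NH2:7]".
# Group 1 is the atom text ("NH2"), group 2 is the map number ("7").
# (Illustrative pattern, not part of RDKit.)
ATOM_TERM = re.compile(r"\[([^\]:]+):(\d+)\]")

def atoms_by_map_number(smiles):
    """Return {map number: atom text} for every mapped bracket atom."""
    return {int(num): text for text, num in ATOM_TERM.findall(smiles)}

smiles = "[cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8]"
labels = atoms_by_map_number(smiles)

# Sorting by map number recovers the order in which the maps were assigned,
# regardless of where each atom ended up in the output string.
for idx in sorted(labels):
    print(idx, labels[idx])
```

Since the map number was set to the atom index in the loop above, sorting on 
it gives the same "index, atom" listing as the MolFragmentToSmiles approach, 
even if the output SMILES had come out in a different atom order.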

Cheers,


                                Andrew
                                da...@dalkescientific.com



