Hi Navid,
I think you have a few options. One is to loop over your molecule’s atoms,
pick out the hydrogens without any neighbors (degree = 0), and delete them. In
Python, finding them would look something like the following:
    from rdkit import Chem
    from rdkit.Chem import rdmolops

    # mol = Chem.MolFromSmiles("C#CC(O)C1CCN1.[HH]")
    mol = Chem.MolFromSmiles("C#CC(O)C1CCN1.[H].[H]")
    # Collect hydrogens (atomic number 1) that have no bonded neighbors.
    disconnected_hydrogens = [
        atom for atom in mol.GetAtoms()
        if atom.GetAtomicNum() == 1 and atom.GetDegree() == 0
    ]
    print([atom.GetIdx() for atom in disconnected_hydrogens])
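To actually delete them, something like the sketch below might work. The helper
name remove_isolated_hydrogens is made up for illustration and is not part of
RDKit. It assumes the molecule can be re-sanitized once the lone hydrogens are
gone.

```python
from rdkit import Chem


def remove_isolated_hydrogens(mol):
    """Return a copy of mol without hydrogens that have no neighbors.

    Hypothetical helper for illustration, not an RDKit function.
    """
    editable = Chem.RWMol(mol)
    isolated = [
        atom.GetIdx() for atom in editable.GetAtoms()
        if atom.GetAtomicNum() == 1 and atom.GetDegree() == 0
    ]
    # RemoveAtom renumbers the remaining atoms, so delete from the
    # highest index down to keep the stored indices valid.
    for idx in sorted(isolated, reverse=True):
        editable.RemoveAtom(idx)
    cleaned = editable.GetMol()
    # Recompute rings, valences, etc. on the edited molecule.
    Chem.SanitizeMol(cleaned)
    return cleaned


mol = Chem.MolFromSmiles("C#CC(O)C1CCN1.[H].[H]")
cleaned = remove_isolated_hydrogens(mol)
print(Chem.MolToSmiles(cleaned))
```

Hydrogens that are bonded to something (degree > 0) are left alone, so the
defective graphs with bonds to dummy atoms can still be inspected afterwards.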
If you know that your dummy hydrogens aren’t connected to the rest of the graph,
you could also split the molecule into its disconnected fragments:
    # Each disconnected component comes back as its own Mol object.
    disconnected_fragments = rdmolops.GetMolFrags(mol, asMols=True)
    print([Chem.MolToSmiles(fragment) for fragment in disconnected_fragments])
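From there, one way to drop the dummies is to keep only the largest fragment.
This is a rough sketch that assumes the real molecule is a single connected
component with more atoms than any dummy fragment; it would not be appropriate
if the real molecule itself has several pieces.

```python
from rdkit import Chem
from rdkit.Chem import rdmolops

mol = Chem.MolFromSmiles("C#CC(O)C1CCN1.[H].[H]")

# Split into connected components, each returned as a separate Mol.
fragments = rdmolops.GetMolFrags(mol, asMols=True)

# Assumption: the fragment with the most atoms is the molecule of interest.
largest = max(fragments, key=lambda fragment: fragment.GetNumAtoms())
print(Chem.MolToSmiles(largest))
```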
As for using dummy atoms, one thing that comes to mind is using atoms with an
atomic number of 0, which RDKit treats as dummy atoms (written as * in SMILES).
Depending on the molecular property you are calculating, this may be good
enough. You can set the atomic number with the atom.SetAtomicNum(0) method.
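As a small illustration, the sketch below relabels the lone hydrogens of the
example molecule as atomic-number-0 dummies. Whether a given property
calculation then ignores them is something you would need to check for each
property; I am not claiming it holds in general.

```python
from rdkit import Chem

mol = Chem.RWMol(Chem.MolFromSmiles("C#CC(O)C1CCN1.[H].[H]"))

# Turn every neighbor-less hydrogen into a dummy atom (atomic number 0).
for atom in mol.GetAtoms():
    if atom.GetAtomicNum() == 1 and atom.GetDegree() == 0:
        atom.SetAtomicNum(0)

# Refresh cached valence information after changing the elements.
mol.UpdatePropertyCache(strict=False)

n_dummies = sum(1 for atom in mol.GetAtoms() if atom.GetAtomicNum() == 0)
print(n_dummies, Chem.MolToSmiles(mol))
```

If you build the RWMol yourself, the same idea applies at construction time:
add the padding nodes with atomic number 0 instead of 1.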
As a side note, I’m not sure the SMILES you provided is valid. Perhaps you
should write each hydrogen as its own fragment (see the code above)?
Best regards,
Alan
From: Navid Shervani-Tabar
Sent: 09 June 2020 21:47
To: RDKit Discuss
Subject: [Rdkit-discuss] Removing disconnected hydrogens
Hello RDKitters,
I'm using a function to convert a molecular graph to an RDKit mol object. Input
molecules have a maximum size of N atoms. Molecules with fewer than N atoms have
dummy atoms on the corresponding nodes. Currently, I use hydrogen as the dummy
atom when building the editable RWMol object. This results in hydrogen atoms
without neighbours. An example of such a molecule has the SMILES representation
'C#CC(O)C1CCN1.[HH]'. I was wondering:
1. How can I remove the hydrogens without neighbours? These hydrogens are
   currently affecting the molecular properties.
2. Is there a better option to use as the dummy atom? Something that
   would potentially not affect the molecular properties.
PS: I can't skip the dummy atoms while building the mol object because some
graphs mistakenly have bonds connected to these atoms, and I need the statistics
on the defective molecules.
Thanks,
Navid